EXTREMAL PROPERTIES OF (EPI)STURMIAN SEQUENCES AND DISTRIBUTION MODULO 1



JEAN-PAUL ALLOUCHE AND AMY GLEN 

Abstract. Starting from a study of Y. Bugeaud and A. Dubickas (2005) on a question in 
distribution of real numbers modulo 1 via combinatorics on words, we survey some combi- 
natorial properties of (epi)Sturmian sequences and distribution modulo 1 in connection to 
their work. In particular we focus on extremal properties of (epi)Sturmian sequences, some 
of which have been rediscovered several times. 



1. Introduction 

A little while ago, JPA came across a paper of Y. Bugeaud and A. Dubickas [21] where the
authors describe all irrational numbers $\xi > 0$ such that the fractional parts $\{\xi b^n\}$,
$n \geq 0$, all belong to an interval of length $1/b$, where $b \geq 2$ is a given integer. They
also prove that $1/b$ is the minimal length having this property. An interesting and unexpected
result in their paper is the following: the irrational numbers $\xi > 0$ such that the fractional
parts $\{\xi b^n\}$, $n \geq 0$, all belong to a closed interval of length $1/b$ are exactly the
positive real numbers whose base $b$ expansions are characteristic Sturmian sequences on
$\{k, k+1\}$, where $k \in \{0, 1, \ldots, b-2\}$.
(Recall that characteristic Sturmian sequences are codings of trajectories on a square billiard
that start from a corner with an irrational slope; alternatively a characteristic Sturmian
sequence can be obtained by coding the sequence of cuts in an integer lattice over the positive
quadrant of $\mathbb{R}^2$ made by a line of irrational slope through the origin.) We will see
that the combinatorial results underlying [21] were stated several times, in particular by
P. Veerman, who proved Bugeaud-Dubickas' number-theoretical statement in the case $b = 2$ as
early as 1986-1987 (see [83, 84]).
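
As a numerical illustration of this statement for $b = 2$, the following Python sketch takes a
finite prefix of the Fibonacci word 01001010... (the characteristic Sturmian sequence of Example 5
below) as a truncated binary expansion of $\xi$, and checks that the fractional parts
$\{\xi 2^n\}$ stay in the closed interval $[\xi/2, \xi/2 + 1/2]$ up to the truncation error. The
function names, the truncation length `N` and the tolerance are illustrative assumptions and are
not taken from [21]; the word is generated with the standard substitution 0 -> 01, 1 -> 0.

```python
from fractions import Fraction

def fibonacci_word(n):
    """First n letters of the Fibonacci word, iterating 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:n]

def truncated_number(digits, b=2):
    """Rational number 0.d0 d1 d2 ... in base b (finite expansion)."""
    return sum(Fraction(int(d), b ** (i + 1)) for i, d in enumerate(digits))

N = 200                        # truncation length (arbitrary choice)
u = fibonacci_word(N)
xi = truncated_number(u)       # approximates xi = 0.0100101..._2

# Candidate interval [0.0u, 0.1u] of length 1/b = 1/2.
lo, hi = xi / 2, xi / 2 + Fraction(1, 2)

def fractional_part(n):
    """Exact fractional part of (truncated xi) * 2^n."""
    return (xi * 2 ** n) % 1

def stays_in_interval(n_max=100):
    """Check lo <= {xi 2^n} <= hi for n < n_max, up to truncation error."""
    for n in range(n_max):
        eps = Fraction(1, 2 ** (N - n))   # bound on the truncation error
        if not (lo - eps <= fractional_part(n) <= hi + eps):
            return False
    return True
```

Exact rational arithmetic is used so that the only approximation is the truncation of the
expansion, which the tolerance `eps` accounts for.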

2. The combinatorial background of a result of Bugeaud and Dubickas 

The main result of Bugeaud and Dubickas [21, Theorem 2.1] will be recalled in Section 6.
Looking at the proof, we see that its core is a result in combinatorics on words that is
encompassed by Theorems 1 and 2 below.

2.1. Sturmian sequences show up. In this section sequences take their values in $\{0,1\}$.
We let $T$ denote the shift map defined as follows: if $s := (s_n)_{n \geq 0}$, then
$T(s) = T((s_n)_{n \geq 0}) := (s_{n+1})_{n \geq 0}$, and we let $\leq$ denote the
lexicographical order on $\{0,1\}^{\mathbb{N}}$ induced by $0 < 1$.

Theorem 1. An aperiodic sequence $s := (s_n)_{n \geq 0}$ on $\{0,1\}$ is Sturmian if and only if
there exists a sequence $u := (u_n)_{n \geq 0}$ on $\{0,1\}$ such that $0u \leq T^k(s) \leq 1u$
for all $k \geq 0$. Moreover, $u$ is the unique characteristic Sturmian sequence with the same
slope as $s$, and we have
$0u = \inf\{T^k(s),\ k \geq 0\}$ and $1u = \sup\{T^k(s),\ k \geq 0\}$.

Theorem 2. An aperiodic sequence $u$ on $\{0,1\}$ is a characteristic Sturmian sequence if and
only if, for all $k \geq 0$,

$$0u < T^k(u) < 1u.$$

Furthermore, we have $0u = \inf\{T^k(u),\ k \geq 0\}$ and $1u = \sup\{T^k(u),\ k \geq 0\}$.

[Theorem 2 is an easy consequence of Theorem 1. For a proof of Theorem 1, see Section 5.1.]
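
The inequalities of Theorem 2 can be tested on finite prefixes. The sketch below is only an
experiment under stated assumptions: it uses a prefix of the Fibonacci word (generated by the
standard substitution 0 -> 01, 1 -> 0) as the sequence $u$, and it compares prefixes of a fixed
length `m`, so the strict inequalities between infinite words become weak inequalities between
finite words. The names `fibonacci_word` and `check_theorem_2` are hypothetical helpers.

```python
def fibonacci_word(n):
    """First n letters of the Fibonacci word, iterating 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:n]

def check_theorem_2(u, m=20):
    """Check 0u <= T^k(u) <= 1u on prefixes of length m.

    Python compares strings lexicographically, and '0' < '1', which matches
    the order used in the text. Only shifts k with k + m <= len(u) are tested.
    """
    low, high = ("0" + u)[:m], ("1" + u)[:m]
    return all(low <= u[k:k + m] <= high for k in range(len(u) - m + 1))

u = fibonacci_word(2000)
print(check_theorem_2(u))        # characteristic word: the bounds hold
print(check_theorem_2(u[1:]))    # a shift of u is Sturmian but not characteristic
```

The second call illustrates that the property singles out the characteristic sequence: the
shifted word $T(u)$ has the same factors as $u$, but it violates the lower bound.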

Actually Theorem 2 was known prior to [21]. It was indicated to JPA by G. Pirillo (who
published it in [72]); JPA suggested that this could well be already in a paper by S. Gan [35]
under a slightly disguised form (which is indeed the case). About 8 years earlier J. Berstel
and P. Seebold [19] and also J.-P. Borel and F. Laubie [20] proved one direction of Theorem 2,
namely that characteristic Sturmian sequences satisfy the inequalities $0u < T^k(u) < 1u$ for
all $k \geq 0$. In fact, it seems that both theorems were proved for the first time (including
the number-theoretical aspect for the case of base 2) by P. Veerman [83, 84]. For more on
the history of that result (including other papers like [22J), see Section 15.11 (in particular 
Section E3J). 

2.2. Generalizations. Two directions for generalizations are possible. One is purely com- 
binatorial and looks at generalizations of Sturmian sequences; in particular episturmian se- 
quences, which share many properties with Sturmian sequences and have similar extremal 
properties. In this direction, characterizations of finite and infinite (epi)Sturmian sequences 
via lexicographic orderings have recently been studied (see [37J [381 HOI HH1 EH [721 E21 174"]). 
The other type of generalization is number-theoretic and looks at distribution modulo 1 from
a combinatorial point of view. Recent papers of Dubickas go in this direction; we cite two of
them showing an unexpected occurrence of the Thue-Morse sequence [29, 30] (see Section 6).

3. More on Sturmian and episturmian sequences 
We give in this section some background on Sturmian and episturmian sequences. 

3.1. Terminology & notation. In what follows, we shall use the following terminology and
notation from combinatorics on words (see, e.g., [65]).

Let $\mathcal{A}$ denote a finite non-empty alphabet. If $w = x_1 x_2 \cdots x_m$ is a finite
word over $\mathcal{A}$, where each $x_i \in \mathcal{A}$, then the length of $w$ is $|w| := m$,
and we let $|w|_a$ denote the number of occurrences of a letter $a$ in $w$. The word of length 0
is called the empty word, denoted by $\varepsilon$. The reversal $\widetilde{w}$ of $w$ is given
by $\widetilde{w} = x_m x_{m-1} \cdots x_1$, and if $w = \widetilde{w}$, then $w$ is called a
palindrome.

An infinite word (or simply sequence) $x$ over $\mathcal{A}$ is a sequence indexed by
$\mathbb{N}$ with values in $\mathcal{A}$, i.e., $x = x_0 x_1 x_2 \cdots$, where each
$x_i \in \mathcal{A}$. A finite word $w$ is a factor of $x$ if $w = \varepsilon$ or
$w = x_i \cdots x_j$ for some $i, j$ with $i \leq j$. Furthermore, if $w$ is not empty, $w$ is
said to be a prefix of $x$ if $i = 0$, and we say that $w$ is right (resp. left) special if
$wa$, $wb$ (resp. $aw$, $bw$) are factors of $x$ for some letters $a, b \in \mathcal{A}$,
$a \neq b$. The set of all factors of $x$ is denoted by $F(x)$, and $F_n(x)$ denotes the set of
factors of length $n$ of $x$, i.e., $F_n(x) := F(x) \cap \mathcal{A}^n$. Moreover, the alphabet
of $x$ is $\mathrm{Alph}(x) := F(x) \cap \mathcal{A}$. A factor of an infinite word $x$ is
recurrent in $x$ if it occurs infinitely many times in $x$. The sequence $x$ itself is said to be
recurrent if all of its factors are recurrent in it. Moreover $x$ is said to be uniformly
recurrent (or minimal) if it is recurrent and if, for any factor, the gaps between its
consecutive occurrences are bounded.

If $u$, $v$ are non-empty words over $\mathcal{A}$, then $v^\omega$ (resp. $uv^\omega$) denotes
the periodic (resp. ultimately periodic) infinite word $vvv\cdots$ (resp. $uvvv\cdots$) having
$|v|$ as a period. An infinite word that is not ultimately periodic is said to be aperiodic.

For any infinite word $x = x_0 x_1 x_2 x_3 \cdots$, recall that the shift map $T$ is defined by
$T(x) = x_1 x_2 x_3 \cdots$. This operator naturally extends to finite words as a circular shift
by defining $T(xw) = wx$ for any letter $x$ and finite word $w$.

The set of all finite (resp. infinite) words over $\mathcal{A}$ is denoted by $\mathcal{A}^*$
(resp. $\mathcal{A}^\omega$), and we define $\mathcal{A}^+ := \mathcal{A}^* \setminus
\{\varepsilon\}$, the set of all non-empty words over $\mathcal{A}$.

3.2. Sturmian sequences. Sturmian sequences were introduced in [70]. They are in some
sense the "least complicated" aperiodic sequences on a binary alphabet, as is evident from
Lemma 3 and Theorem 4 below. The following lemma can essentially be found in [70].

Lemma 3. [70] Let $s$ be a sequence taking exactly $a \geq 2$ distinct values. Let $p(k)$ be the
number of distinct factors of length $k$ of $s$ (the function $k \mapsto p(k)$ is called the
block-complexity of the sequence $s$). Then the following properties are equivalent.

(i) There exists $k_0 \geq 1$ such that $p(k_0 + 1) = p(k_0)$.

(ii) The sequence $(p(k))_{k \geq 1}$ is ultimately constant (i.e., constant from some index on).

(iii) There exists $M$ such that $p(k) \leq M$ for all $k \geq 1$.

(iv) There exists $k_1 \geq 1$ such that $p(k_1) \leq k_1 + a - 2$.

(v) Let $g(k) = p(k) - k$. There exists $k_2 \geq 1$ such that $g(k_2 + 1) < g(k_2)$.

(vi) The sequence $s$ is ultimately periodic.

Proof. For any sequence, we clearly have $p(k+1) \geq p(k)$ for all $k \geq 0$. This implies on
the one hand that properties (ii) and (iii) are equivalent. On the other hand, this implies the
equivalence of properties (i) and (v). Namely, letting $g(k) := p(k) - k$, we have
$g(k+1) - g(k) = p(k+1) - p(k) - 1$.

The implications (vi) $\Rightarrow$ (ii) $\Rightarrow$ (iv) are straightforward. It thus suffices
to prove that (iv) $\Rightarrow$ (i), and (i) $\Rightarrow$ (vi).

(iv) $\Rightarrow$ (i): if (i) is not true, then the sequence $(p(k))_{k \geq 0}$ is (strictly)
increasing. Thus, for all $k \geq 1$, one has $p(k+1) \geq p(k) + 1$. Hence, by an easy induction,
one has $p(k) \geq p(1) + k - 1 = a + k - 1$, i.e., $p(k) > a + k - 2$, for all $k \geq 1$.

(i) $\Rightarrow$ (vi): the equality $p(k_0 + 1) = p(k_0)$ shows that $s$ has no right special
factor of length $k_0$. But this implies in turn that $s$ has no right special factor of length
$k_0 + 1$ (such a factor would give a right special factor of length $k_0$ by removing its first
letter). Iterating shows that $s$ has no right special factor of length $k$, for any
$k \geq k_0$. This implies that $s$ is ultimately periodic ($s$ can be written as a concatenation
of words of length $k_0$ and each of these words must always be followed by the same word). □

We see from Lemma 3 above that an aperiodic sequence taking exactly $a$ distinct values
must satisfy $p(k) \geq k + a - 1$ for all $k \geq 1$. The "simplest" aperiodic sequences would
thus be sequences with the smallest $p(k)$, i.e., sequences (if any) satisfying $p(k) = k + 1$
for all $k \geq 1$. Such sequences do exist; they are called Sturmian sequences. They are
characterized in Theorem 4 below (see, e.g., [65]). Note that Sturmian sequences admit several
equivalent definitions and
have numerous characterizations; for instance, they can be characterized by their palindrome 
or return word structure [271 E3] • 
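
The block-complexity function is easy to explore experimentally. The following sketch counts
the distinct factors of each length in a long prefix of the Fibonacci word and in a periodic
word. It is an illustration on finite prefixes, not a proof: the count on a prefix is only a
lower bound for $p(k)$, and it agrees with $p(k)$ once the prefix is long enough to contain
every factor of length $k$. The prefix length 5000 and the function names are assumptions made
for this example.

```python
def fibonacci_word(n):
    """First n letters of the Fibonacci word, iterating 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:n]

def block_complexity(w, k):
    """Number of distinct factors of length k occurring in the finite word w."""
    return len({w[i:i + k] for i in range(len(w) - k + 1)})

f = fibonacci_word(5000)
for k in range(1, 6):
    # Aperiodic (Sturmian) case: k + 1 factors; periodic case: bounded count.
    print(k, block_complexity(f, k), block_complexity("01" * 100, k))
```

On this prefix the Fibonacci word shows exactly $k + 1$ factors of length $k$ for small $k$,
while the periodic word $(01)^\omega$ has a bounded number of factors, as in Lemma 3.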

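Theorem 4. For any infinite word $s$ over $\{a, b\}$, the following properties are equivalent.
If $s$ satisfies these properties, then $s$ is called Sturmian.

• The number of factors of $s$ of length $n$ is equal to $n + 1$, for all $n \geq 1$.

• There exist an irrational real number $\alpha > 0$ and a real number $\rho$, respectively
called the slope and the intercept of $s$, such that $s$ is equal to one of the following two
infinite words:

$$s_{\alpha,\rho},\ s'_{\alpha,\rho} : \mathbb{N} \to \{a, b\}$$

defined by

$$s_{\alpha,\rho}(n) = \begin{cases}
a & \text{if } \lfloor (n+1)\alpha + \rho \rfloor - \lfloor n\alpha + \rho \rfloor = \lfloor \alpha \rfloor, \\
b & \text{if } \lfloor (n+1)\alpha + \rho \rfloor - \lfloor n\alpha + \rho \rfloor \neq \lfloor \alpha \rfloor,
\end{cases}$$

$$s'_{\alpha,\rho}(n) = \begin{cases}
a & \text{if } \lceil (n+1)\alpha + \rho \rceil - \lceil n\alpha + \rho \rceil = \lfloor \alpha \rfloor, \\
b & \text{if } \lceil (n+1)\alpha + \rho \rceil - \lceil n\alpha + \rho \rceil \neq \lfloor \alpha \rfloor,
\end{cases}$$

for $n \geq 0$ (where $\lfloor x \rfloor$ denotes the greatest integer $\leq x$ and
$\lceil x \rceil$ denotes the least integer $\geq x$). Moreover, $s$ is said to be
characteristic Sturmian if $\rho = \alpha$, in which case
$s = s_{\alpha,\alpha} = s'_{\alpha,\alpha}$.

Example 5. Taking $a = 0$, $b = 1$, and $\alpha = \rho = (3 - \sqrt{5})/2$, we get the
characteristic Sturmian sequence 01001010..., which is called the (binary) Fibonacci sequence.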
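
The floor formula of Theorem 4 translates directly into code. The sketch below evaluates
$s_{\alpha,\rho}(n)$ for the parameters of Example 5. It relies on floating-point arithmetic,
which is an assumption that is harmless for short prefixes but could misclassify a letter when
$n\alpha + \rho$ is extremely close to an integer; the function name `sturmian_lower` is
hypothetical.

```python
import math

def sturmian_lower(alpha, rho, n_terms, a="0", b="1"):
    """Letters s(0), ..., s(n_terms - 1) of s_{alpha,rho} (floor version)."""
    letters = []
    for n in range(n_terms):
        diff = math.floor((n + 1) * alpha + rho) - math.floor(n * alpha + rho)
        letters.append(a if diff == math.floor(alpha) else b)
    return "".join(letters)

alpha = (3 - math.sqrt(5)) / 2           # slope of Example 5
print(sturmian_lower(alpha, alpha, 13))  # intercept rho = alpha: characteristic word
```

With $\rho = \alpha$ the output starts with 01001010, the prefix quoted in Example 5.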

Remark 6. By definition it is clear that any Sturmian sequence is over a 2-letter alphabet.
It also follows from Lemma 3 that Sturmian sequences are aperiodic. Note that if we choose
$\alpha$ to be rational in the above definition, we obtain (purely) periodic sequences, referred
to as periodic balanced sequences (see below). (Some authors also use the name periodic Sturmian
sequences.) We will call characteristic periodic balanced sequences those obtained with a
rational slope $\alpha > 0$ and intercept $\rho = \alpha$ in Theorem 4. Also note that the names
"slope" and "intercept" refer to the geometric realization of Sturmian words as approximations
to the line $y = \alpha x + \rho$ (called mechanical words in [65, Chapter 2]).

All Sturmian sequences are "balanced" in the following sense. 

Definition 7. A finite or infinite word $w$ over $\{a, b\}$ is said to be balanced if, for any
factors $u$, $v$ of $w$ with $|u| = |v|$, we have $\big||u|_b - |v|_b\big| \leq 1$ (or
equivalently $\big||u|_a - |v|_a\big| \leq 1$).
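
For a finite binary word, Definition 7 can be checked by brute force. The following sketch
uses prefix sums to count the occurrences of a chosen letter in every factor; the function
name `is_balanced` and the quadratic-time strategy are choices made for this illustration only.

```python
def is_balanced(w, letter="1"):
    """True if any two factors of w of the same length differ by at most 1
    in their number of occurrences of `letter`."""
    # prefix[i] = number of occurrences of `letter` in w[:i]
    prefix = [0]
    for c in w:
        prefix.append(prefix[-1] + (c == letter))
    for k in range(1, len(w) + 1):
        counts = [prefix[i + k] - prefix[i] for i in range(len(w) - k + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True

print(is_balanced("0100101001001"))  # prefix of the Fibonacci word
print(is_balanced("0011"))           # factors 00 and 11 differ by 2
```

Over a 2-letter alphabet the test gives the same answer whichever letter is counted, in line
with the equivalence stated in the definition.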

The term "balanced" is relatively new; it appeared in [191 02] (also see [651 Chapter 2]), 
and the notion itself dates back to [70, 24]. In the pioneering paper [70], balanced infinite
words over a 2-letter alphabet are called "Sturmian trajectories" and belong to three classes
corresponding to: Sturmian; periodic balanced; and a class of non-recurrent infinite words
that are ultimately periodic (but not periodic), called skew words. That is, the family of
balanced infinite words consists of the Sturmian words and periodic balanced words (which
are recurrent), and the (non-recurrent) skew infinite words, the factors of which are balanced.
In particular, we have the following result due to Morse and Hedlund [70], and Coven and
Hedlund [24] (see also [65, Theorem 2.1.3]):

Theorem 8. A binary sequence is Sturmian if and only if it is balanced and aperiodic. 

Note. A description of skew words is given in part (ii) of Theorem 21. Simple examples are
infinite words of the form $a^\ell b a^\omega$, where $\ell \in \mathbb{N}$.

It is important to note that a finite word is finite Sturmian (i.e., a factor of some Sturmian
word) if and only if it is balanced [65, Chapter 2, Proposition 2.1.17]. Accordingly, balanced
infinite words are precisely the infinite words whose factors are finite Sturmian. This concept
is generalized in [30] by showing that the set of all infinite words whose factors are finite
episturmian consists of the (recurrent) episturmian words and the (non-recurrent) episkew
infinite words (i.e., non-recurrent infinite words, all of whose factors are finite episturmian),
see Section 3.3.2.

For a comprehensive introduction to Sturmian words, see for instance [9j[65l[75] and refer- 
ences therein. Also see [22} ITU [81] [82] for further work on skew words. 

We end this section with a simple and useful proposition which deserves to be better known. 
Its two parts were suggested several years ago to JPA in the case of binary sequences by J. 
Cassaigne and J. Berstel respectively (private communications). 

Proposition 9. Let $s$ be a sequence taking exactly $a \geq 2$ distinct values and let $p(k)$ be
the number of distinct factors of length $k$ of $s$.

(i) If $s$ is aperiodic and admits at most one left special factor of each length, then one
has $k + a - 1 \leq p(k) \leq (a - 1)k + 1$ for all $k \geq 1$. In particular an aperiodic binary
sequence which has at most one left special factor of each length is Sturmian.

(ii) If there exists $k_0 \geq 1$ such that $p(k) = k + a - 1$ for all $k \geq k_0$, then
$p(k) = k + a - 1$ for all $k \geq 1$. In particular if a binary sequence satisfies
$p(k) = k + 1$ for all $k$ larger than some $k_0$, then it is Sturmian.

Proof.

(i). Using part (iv) of Lemma 3, we have $p(k) \geq k + a - 1$ for all $k \geq 1$, since $s$ is
aperiodic. On the other hand, erasing the first letter of all factors of $s$ of length $k + 1$
gives all factors of length $k$. There is at most one of these factors of length $k$ which can be
obtained from distinct factors of length $k + 1$ (since $s$ admits at most one left special
factor of length $k$), and if so there can be at most $a$ such distinct factors of length $k + 1$
(since a left special factor can be extended on the left by at most $a$ letters). Hence
$p(k+1) - p(k) \leq a - 1$ for all $k \geq 1$. By telescopic summation, this implies
$p(k) \leq (a-1)(k-1) + p(1) = (a-1)(k-1) + a = ak - k + 1$.

(ii). Let $k_1$ be the least integer $\geq 1$ such that for all $k \geq k_1$, one has
$p(k) = k + a - 1$. Suppose that $k_1 > 1$, and let $\ell := k_1 - 1$. Then
$p(\ell) \neq \ell + a - 1$ by minimality of $k_1$. But
$p(\ell) \leq p(k_1) = k_1 + a - 1 = \ell + a$.
Hence either $p(\ell) = \ell + a$, or $p(\ell) \leq \ell + a - 2$. In either case $s$ would be
ultimately periodic (by Lemma 3(i), resp. by part (iv) of Lemma 3), a contradiction. Hence
$k_1 = 1$ and the claim about Sturmianicity follows from Theorem 4. □

3.3. Episturmian sequences. It is well known that the set of factors of any Sturmian
sequence is closed under reversal, i.e., if $u$ is a factor of a Sturmian sequence $s$, then its
reversal $\widetilde{u}$ is also a factor of $s$ (e.g., see [67] or [65, Proposition 2.1.19]).
In fact:

Theorem 10. An aperiodic binary sequence $s$ is Sturmian if and only if $F(s)$ is closed under
reversal and $s$ admits exactly one left special factor of each length.

Proof. Let $s$ be an aperiodic binary sequence. First suppose that $s$ is Sturmian. For a proof
of the fact that $F(s)$ is closed under reversal, see [67] or [65, Proposition 2.1.19]. Now we
will show that $s$ has exactly one left special factor of each length.

Let $p(n)$ denote the number of factors of length $n$ of $s$. Since $F(s)$ is closed under
reversal, a factor of $s$ is left special (resp. right special) if and only if its reversal is
right special (resp. left special). Hence, for all $n \geq 1$, the difference $p(n+1) - p(n)$ is
equal to the number of left special factors of $s$ of length $n$. Therefore, since
$p(n+1) - p(n) = 1$ for all $n \geq 1$ (by Theorem 4), $s$ admits exactly one left special factor
(or equivalently, right special factor) of each length.

The converse follows immediately from part (i) of Proposition 9. □
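
Both conditions of Theorem 10 can be observed on a long prefix of the Fibonacci word. The
sketch below lists the left special factors of each small length and tests closure under
reversal. It works on a finite prefix (length 5000, an arbitrary choice), so it is evidence
rather than proof, and the helper names are hypothetical.

```python
def fibonacci_word(n):
    """First n letters of the Fibonacci word, iterating 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:n]

def factors(w, k):
    """Set of factors of length k of the finite word w."""
    return {w[i:i + k] for i in range(len(w) - k + 1)}

def left_special(w, k, alphabet="01"):
    """Factors v of length k such that cv occurs in w for at least two letters c."""
    longer = factors(w, k + 1)
    return {v for v in factors(w, k)
            if sum(c + v in longer for c in alphabet) >= 2}

def closed_under_reversal(w, k):
    """True if the reversal of every length-k factor of w is again a factor."""
    fk = factors(w, k)
    return all(v[::-1] in fk for v in fk)

f = fibonacci_word(5000)
for k in range(1, 6):
    print(k, left_special(f, k), closed_under_reversal(f, k))
```

For each length tested there is a single left special factor, and it is the prefix of that
length, as expected for a characteristic Sturmian word.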

Inspired by results of this flavour, Droubay, Justin, and Pirillo [26, 50] introduced the
following natural generalization of Sturmian sequences on an arbitrary finite alphabet
$\mathcal{A}$.

Definition 11. [26] An infinite word $t \in \mathcal{A}^\omega$ is said to be episturmian if its
set of factors $F(t)$ is closed under reversal and $t$ admits at most one left special factor (or
equivalently, right special factor) of each length.

Note. When A is a 2-letter alphabet, this definition gives the Sturmian words as well as the 
periodic balanced words. 

In the seminal paper [26], episturmian words were defined as an extension of standard 
episturmian words, which were first introduced as a generalization of characteristic Sturmian 
words using iterated palindromic closure (a construction due to de Luca [25]).

The palindromic right-closure $w^{(+)}$ of a finite word $w$ is the (unique) shortest palindrome
beginning with $w$ (see [25]). More precisely, if $w = uv$ where $v$ is the longest palindromic
suffix of $w$, then $w^{(+)} = uv\widetilde{u}$. For example, $(tie)^{(+)} = tieit$. The
iterated palindromic closure function [39], denoted by $Pal$, is defined recursively as follows.
Set $Pal(\varepsilon) = \varepsilon$ and, for any word $w$ and letter $x$, define
$Pal(wx) := (Pal(w)x)^{(+)}$. For instance,
$Pal(abc) = (Pal(ab)c)^{(+)} = (abac)^{(+)} = abacaba$. Note that $Pal$ is injective; and
moreover, it is clear from the definition that $Pal(w)$ is a prefix of $Pal(wx)$ for any word
$w$ and letter $x$. Hence, if $v$ is a prefix of $w$, then $Pal(v)$ is a prefix of $Pal(w)$.

Theorem 12. [26] For an infinite word $s \in \mathcal{A}^\omega$, the following properties are
equivalent.

(i) There exists an infinite word $\Delta = x_1 x_2 x_3 \cdots$ ($x_i \in \mathcal{A}$), called
the directive word of $s$, such that $s = \lim_{n \to \infty} Pal(x_1 \cdots x_n)$.

(ii) $F(s)$ is closed under reversal and all of the left special factors of $s$ are prefixes of
it.

An infinite word $s$ satisfying the above properties is said to be standard episturmian (or
epistandard for short).
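
The palindromic right-closure and the map $Pal$ can be computed directly from their
definitions. The following sketch does so with a naive search for the longest palindromic
suffix; the function names `palindromic_closure` and `pal` are illustrative, and no attempt is
made at efficiency.

```python
def palindromic_closure(w):
    """Shortest palindrome having w as a prefix.

    Writes w = u v with v the longest palindromic suffix of w and
    returns u v reversed(u).
    """
    for i in range(len(w) + 1):
        v = w[i:]
        if v == v[::-1]:            # first hit is the longest palindromic suffix
            return w + w[:i][::-1]

def pal(w):
    """Iterated palindromic closure: Pal(empty) = empty, Pal(wx) = (Pal(w)x)^(+)."""
    result = ""
    for x in w:
        result = palindromic_closure(result + x)
    return result

print(palindromic_closure("tie"))   # tieit
print(pal("abc"))                   # abacaba
print(pal("abca"))                  # longer prefix of the word directed by (abc)^omega
```

Applying `pal` to longer and longer prefixes of a directive word produces longer and longer
prefixes of the corresponding epistandard word, since `pal(v)` is a prefix of `pal(w)` whenever
`v` is a prefix of `w`.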

The above characterization of epistandard words extends to the case of an arbitrary finite 
alphabet a construction given in [25] for all characteristic Sturmian words. 

Example 13. The epistandard word $r$ directed by $\Delta = (abc)^\omega$ is known as the
Tribonacci word; it begins in the following way:

$$r = \underline{a}\,\underline{b}a\underline{c}aba\underline{a}bacaba\underline{b}acabaabacaba\underline{c}abaabaca\cdots,$$

where each palindromic prefix $Pal(x_1 \cdots x_{n-1})$ is followed by an underlined letter
$x_n$. More generally, for $k \geq 2$, the $k$-bonacci word is the epistandard word over
$\{a_1, a_2, \ldots, a_k\}$ directed by $(a_1 a_2 \cdots a_k)^\omega$.

Remark 14. In [26], Droubay et al. proved that an infinite word $t$ is episturmian if and only
if $F(t) = F(s)$ for some epistandard word $s$. They also proved that episturmian words are
uniformly recurrent; hence any such infinite word is either (purely) periodic or aperiodic. The
aperiodic episturmian words are precisely the episturmian words that admit exactly one left
special factor of each length. In fact, an epistandard word $s$ (and hence any episturmian word
with the same set of factors as $s$) is periodic if and only if exactly one letter occurs
infinitely often in the directive word of $s$ (see [50, Proposition 2.9]).

The notion of a directive word (as defined for epistandard words in Theorem 12) extends
to all episturmian words with respect to episturmian morphisms, which play a central role
in the study of these words. Episturmian morphisms were first introduced as a generalization of
Sturmian morphisms; Justin and Pirillo [50] showed that they are exactly the morphisms that
preserve the aperiodic episturmian words (i.e., the morphisms that map aperiodic episturmian
words onto aperiodic episturmian words). Such morphisms naturally generalize to any finite
alphabet the Sturmian morphisms on two letters. A morphism $\varphi$ is said to be Sturmian if
$\varphi(s)$ is Sturmian for any Sturmian word $s$. The set of Sturmian morphisms over
$\{a, b\}$ is closed under composition, and consequently it is a submonoid of the endomorphisms
of $\{a, b\}^*$. Moreover, it is well known that the monoid of Sturmian morphisms is generated by
the three morphisms $(a \mapsto ab,\ b \mapsto a)$, $(a \mapsto ba,\ b \mapsto a)$,
$(a \mapsto b,\ b \mapsto a)$ and that Sturmian morphisms are precisely the morphisms that map
Sturmian words onto Sturmian words (see [19, 68]; also see Section 5.2 later).

By definition (see [26, 50]), the monoid of all episturmian morphisms is generated, under
composition, by all the morphisms:

• $\psi_a$: $\psi_a(a) = a$, $\psi_a(x) = ax$ for any letter $x \neq a$;

• $\bar{\psi}_a$: $\bar{\psi}_a(a) = a$, $\bar{\psi}_a(x) = xa$ for any letter $x \neq a$;

• $\theta_{ab}$: exchange of letters $a$ and $b$.

Moreover, the monoid of so-called epistandard morphisms is generated by all the $\psi_a$ and
the $\theta_{ab}$, and the monoid of pure episturmian morphisms (resp. pure epistandard
morphisms) is generated by the $\psi_a$ and $\bar{\psi}_a$ only (resp. the $\psi_a$ only). The
monoid of the permutation morphisms (i.e., the morphisms $\varphi$ such that
$\varphi(\mathcal{A}) = \mathcal{A}$) is generated by all the $\theta_{ab}$.
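
The generators listed above act letter by letter, so they are simple to implement on finite
words. The sketch below defines them as Python functions (the names `psi`, `psi_bar`, `theta`
and `fibonacci_morphism` are assumptions of this example) and checks on a sample word that the
Sturmian morphism $(a \mapsto ab,\ b \mapsto a)$ coincides with the composition
$\psi_a \circ \theta_{ab}$.

```python
def psi(a, w):
    """psi_a: a -> a, x -> a x for every letter x other than a."""
    return "".join(c if c == a else a + c for c in w)

def psi_bar(a, w):
    """bar(psi)_a: a -> a, x -> x a for every letter x other than a."""
    return "".join(c if c == a else c + a for c in w)

def theta(a, b, w):
    """theta_ab: exchange of the letters a and b."""
    swap = {a: b, b: a}
    return "".join(swap.get(c, c) for c in w)

def fibonacci_morphism(w):
    """The Sturmian morphism a -> ab, b -> a."""
    return "".join("ab" if c == "a" else "a" for c in w)

word = "abaab"
print(psi("a", "abc"), psi_bar("a", "abc"), theta("a", "b", "abca"))
print(fibonacci_morphism(word) == psi("a", theta("a", "b", word)))
```

Iterating `fibonacci_morphism` from the letter `a` yields a, ab, aba, abaab, abaababa, ...,
that is, prefixes of the Fibonacci word of Example 5 written over $\{a, b\}$.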

As shown in [50], any episturmian word is the image of another episturmian word by some
pure episturmian morphism and any episturmian word can be infinitely decomposed over
the set of pure episturmian morphisms. This last property allows an episturmian word to be
defined by one of its morphic decompositions or, equivalently, by a certain spinned directive
word, which is an infinite sequence of rules for decomposing the given episturmian word by
morphisms. See [41, 52] for recent work concerning directive words of episturmian words.

Remark 15. The shift-orbit of an infinite word $x \in \mathcal{A}^\omega$ is the set
$\mathcal{O}(x) = \{T^i(x),\ i \geq 0\}$ and its closure is given by
$\overline{\mathcal{O}}(x) = \{y \in \mathcal{A}^\omega,\ \mathrm{Pref}(y) \subseteq
\bigcup_{i \geq 0} \mathrm{Pref}(T^i(x))\}$, where $\mathrm{Pref}(w)$ denotes the set of prefixes
of a finite or infinite word $w$. Note that for any infinite word $t$ and
$x \in \overline{\mathcal{O}}(t)$, $F(x) \subseteq F(t)$. If, moreover, $t$ is uniformly
recurrent, then it follows that for each $n \geq 1$, $F_n(x) = F_n(t)$, and hence
$F(x) = F(t)$ for any $x \in \overline{\mathcal{O}}(t)$ (see for instance
[75, Proposition 5.1.10] or [65, Proposition 1.5.9]). This implies that
$\overline{\mathcal{O}}(x) = \overline{\mathcal{O}}(t)$ for any
$x \in \overline{\mathcal{O}}(t)$; in other words, $\overline{\mathcal{O}}(t)$ is a minimal
dynamical system (see, e.g., [65], [75]). Accordingly, since episturmian words are uniformly
recurrent, the closure of the shift-orbit of any episturmian $t$ is a minimal dynamical system;
in particular, $\overline{\mathcal{O}}(t)$ consists of all the episturmian words with the same
set of factors as $t$ (see, e.g., [75]).

Note that if $t$ is aperiodic, then $\overline{\mathcal{O}}(t)$ contains a unique epistandard
word with the same set of factors as $t$, whereas if $t$ is periodic,
$\overline{\mathcal{O}}(t)$ contains two different epistandard words (see
for instance [HU HI] ) . 

3.3.1. Strict episturmian words. 

Definition 16. An epistandard word s (or any episturmian word with the same set of factors 
as s) is said to be strict if every letter in the alphabet of s occurs infinitely often in its directive 
word. 

Strict episturmian words on $k$ letters are often said to be $k$-strict; these words have
$(k-1)n + 1$ distinct factors of length $n$ for all $n \geq 1$ (as proven in [26, p. 549]) and
they coincide with the $k$-letter Arnoux-Rauzy sequences introduced in [15]. In particular, the
2-strict episturmian words are exactly the Sturmian words since these words have $n + 1$
distinct factors of length $n$ for each $n \geq 1$ (recall Theorem 4).
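
The factor count $(k-1)n + 1$ can be observed on the Tribonacci word, which is 3-strict since
each of $a$, $b$, $c$ occurs infinitely often in its directive word $(abc)^\omega$. The sketch
below builds a long prefix by iterated palindromic closure and counts factors. It is an
experiment on a finite prefix under the assumption that the prefix is long enough to contain
all factors of the lengths tested; the helper names are hypothetical.

```python
def palindromic_closure(w):
    """Shortest palindrome having w as a prefix."""
    for i in range(len(w) + 1):
        if w[i:] == w[i:][::-1]:    # longest palindromic suffix found
            return w + w[:i][::-1]

def pal(w):
    """Iterated palindromic closure Pal(w)."""
    result = ""
    for x in w:
        result = palindromic_closure(result + x)
    return result

def block_complexity(w, k):
    """Number of distinct factors of length k occurring in the finite word w."""
    return len({w[i:i + k] for i in range(len(w) - k + 1)})

tribonacci_prefix = pal("abc" * 4)   # prefix of the word directed by (abc)^omega
for n in range(1, 6):
    print(n, block_complexity(tribonacci_prefix, n))   # compare with 2n + 1
```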

Note that any episturmian word takes the form $\varphi(t)$ with $\varphi$ an episturmian
morphism and $t$ an Arnoux-Rauzy sequence (or strict episturmian word). In this sense,
episturmian words are only a slight generalization of Arnoux-Rauzy sequences. For example, the
family of episturmian words on three letters $\{a, b, c\}$ consists of the Arnoux-Rauzy
sequences over $\{a, b, c\}$, the Sturmian words over $\{a, b\}$, $\{b, c\}$, $\{a, c\}$ and
their images under episturmian morphisms on $\{a, b, c\}$, and periodic infinite words of the
form $\varphi(x)^\omega$ where $\varphi$ is an episturmian morphism on $\{a, b, c\}$ and
$x \in \{a, b, c\}$.






3.3.2. Episkew words. A finite word w is said to be finite Sturmian (resp. finite episturmian) 
if w is a factor of some infinite Sturmian (resp. episturmian) word. 

Recall from Section 3.2 that skew words are ultimately periodic (but not periodic) infinite
words, all of whose factors are finite Sturmian (or equivalently, balanced). Over a 2-letter 
alphabet, skew words constitute the family of non-recurrent balanced infinite words, whereas 
the recurrent balanced infinite words consist of the Sturmian words and the periodic balanced 
words. 

Inspired by Morse and Hedlund's [70] skew words, episkew words were recently defined in
[40] as non-recurrent infinite words, all of whose factors are finite episturmian. A number of
equivalent definitions of such words were given in [30] (also see Theorem 21 to follow).

Episkew words were first alluded to (but not explicated) in [37]. Following that paper, 
these words showed up again in the study of inequalities characterizing finite and infinite 
episturmian words in relation to lexicographic orderings [40]: in fact, as detailed in Section
5.1, episturmian words have extremal properties similar to those of Sturmian words.

To learn more about episturmian and episkew words, see for instance the recent surveys 
[HI [39]. 

4. Extremal words 

Suppose the alphabet $\mathcal{A}$ is totally ordered by the relation $<$. Then we can totally
order $\mathcal{A}^+$ by the lexicographic order $<$, defined as follows. Given two non-empty
finite words $u$, $v$ on $\mathcal{A}$, we have $u < v$ if and only if either $u$ is a prefix of
$v$ (with $u \neq v$) or $u = xau'$ and $v = xbv'$, for some finite words $x$, $u'$, $v'$ and
letters $a$, $b$ with $a < b$. This is the usual alphabetic ordering in a dictionary, and we say
that $u$ is lexicographically less than $v$. This notion naturally extends to infinite words, as
follows. Let $u = u_0 u_1 u_2 \cdots$ and $v = v_0 v_1 v_2 \cdots$, where
$u_j, v_j \in \mathcal{A}$. We define $u < v$ if there exists an index $i \geq 0$ such that
$u_j = v_j$ for all $j = 0, 1, \ldots, i-1$ and $u_i < v_i$.

Let $w$ be a finite or infinite word on $\mathcal{A}$, and let $k$ be a positive integer. We let
$\min(w|k)$ (resp. $\max(w|k)$) denote the lexicographically smallest (resp. greatest) factor of
$w$ of length $k$ for the given order (where $|w| \geq k$ if $w$ is finite).

If $w$ is infinite, then it is clear that $\min(w|k)$ and $\max(w|k)$ are prefixes of the
respective words $\min(w|k+1)$ and $\max(w|k+1)$. So we can define, by taking limits, the
following two infinite words (see [73]):

$$\min(w) = \lim_{k \to \infty} \min(w|k) \quad \text{and} \quad \max(w) = \lim_{k \to \infty} \max(w|k).$$

That is, to any infinite word t we can associate two infinite words min(t) and max(t) such that 
any prefix of min(t) (resp. max(t)) is the lexicographically smallest (resp. greatest) amongst 
the factors of t of the same length. 
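
The words $\min(w|k)$ and $\max(w|k)$ are straightforward to compute for a finite word, and
the nesting of these prefixes can be observed directly. The sketch below does this for a long
prefix of the Fibonacci word $u$ over $\{0, 1\}$ and compares the results with the prefixes of
$0u$ and $1u$, in the spirit of Theorem 2. The prefix length and the function names are
assumptions of this illustration, and the finite computation only reflects the infinite word
for lengths $k$ that are small compared with the prefix.

```python
def fibonacci_word(n):
    """First n letters of the Fibonacci word, iterating 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:n]

def min_factor(w, k):
    """Lexicographically smallest factor of length k of the finite word w."""
    return min(w[i:i + k] for i in range(len(w) - k + 1))

def max_factor(w, k):
    """Lexicographically greatest factor of length k of the finite word w."""
    return max(w[i:i + k] for i in range(len(w) - k + 1))

f = fibonacci_word(5000)
for k in range(1, 8):
    # Each line extends the previous one: min(f|k) is a prefix of min(f|k+1).
    print(k, min_factor(f, k), max_factor(f, k))
```

On this prefix, `min_factor(f, k)` agrees with the first $k$ letters of $0u$ and
`max_factor(f, k)` with the first $k$ letters of $1u$ for the small values of $k$ tested.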

For a finite word $w$ on a totally ordered alphabet $\mathcal{A}$, $\min(w)$ denotes
$\min(w|k)$ where $k$ is maximal such that all $\min(w|j)$, $j = 1, 2, \ldots, k$, are prefixes
of $\min(w|k)$. In the case $\mathcal{A} = \{a, b\}$, $\max(w)$ is defined similarly (see [40]).

The following definition, given in [40], will be useful in the next section, where we survey
recent work concerning extremal properties of (epi)Sturmian sequences, particularly inequalities
characterizing such words (finite and infinite).

Definition 17. An acceptable pair for an alphabet $\mathcal{A}$ is a pair $(a, <)$ where $a$ is
a letter in $\mathcal{A}$ and $<$ is a total order on $\mathcal{A}$ such that
$a = \min(\mathcal{A})$.

5. Extremal properties 

In 2003, Pirillo [72] (also see [73]) proved that, for infinite words $s$ on a 2-letter alphabet
$\{a, b\}$ with $a < b$, the inequality

(1) $as \leq \min(s) \leq \max(s) \leq bs$

characterizes the characteristic Sturmian words and characteristic periodic balanced words.

Remark 18. Characteristic periodic balanced sequences, which correspond to the "Sturmian"
sequences with rational slope $\alpha > 0$ and intercept $\rho = \alpha$ (see Theorem 4 and
Remark 6), are precisely the sequences of the form $(Pal(v)xy)^\omega$ where
$v \in \{a, b\}^*$ and $\{x, y\} = \{a, b\}$
(see for instance [El [17] I26j). Also note that if s is a characteristic Sturmian sequence, then 
$as = \min(s)$ and $bs = \max(s)$. On the other hand, if $s$ is a characteristic periodic
balanced sequence, then either:

• $as < \min(s)$ and $bs = \max(s)$ when $s$ takes the form $(Pal(v)ab)^\omega$,

• or $as = \min(s)$ and $\max(s) < bs$ when $s$ takes the form $(Pal(v)ba)^\omega$.

For example, the characteristic periodic balanced sequence
$s := (Pal(ab)ab)^\omega = (abaab)^\omega$ satisfies

$$as = a(abaab)^\omega < \min(s) = (aabab)^\omega \quad \text{and} \quad bs = b(abaab)^\omega = \max(s),$$

whereas $s' := (Pal(ab)ba)^\omega = (ababa)^\omega$ satisfies

$$as' = a(ababa)^\omega = \min(s') \quad \text{and} \quad \max(s') = (babaa)^\omega < bs' = b(ababa)^\omega.$$

More generally, given two characteristic periodic balanced sequences $s$, $s'$ of the form
$s = (Pal(v)ab)^\omega$ and $s' = (Pal(v)ba)^\omega$ for some $v \in \{a, b\}^*$, we have

$$\min(s) = \min(s') = (aPal(v)b)^\omega \quad \text{and} \quad \max(s) = \max(s') = (bPal(v)a)^\omega.$$

See [73] for more details. 
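
The example in Remark 18 can be verified mechanically. For a purely periodic word
$v^\omega$, the words $\min$ and $\max$ are the periodic words generated by the smallest and
greatest rotations of $v$, and ultimately periodic words can be compared on a sufficiently long
common prefix. The sketch below follows this approach; the helper names and the comparison
length `n = 20` (four periods) are assumptions of this illustration.

```python
def rotations(v):
    """All rotations (circular shifts) of the finite word v."""
    return [v[i:] + v[:i] for i in range(len(v))]

def prefix(pre, per, n):
    """First n letters of the ultimately periodic word pre . per^omega."""
    reps = n // len(per) + 1
    return (pre + per * reps)[:n]

n = 20  # comparison length, several periods long

# s = (abaab)^omega: as < min(s) and bs = max(s)
s = "abaab"
print(min(rotations(s)), max(rotations(s)))
print(prefix("a", s, n) < prefix("", min(rotations(s)), n))
print(prefix("b", s, n) == prefix("", max(rotations(s)), n))

# s' = (ababa)^omega: as' = min(s') and max(s') < bs'
t = "ababa"
print(prefix("a", t, n) == prefix("", min(rotations(t)), n))
print(prefix("", max(rotations(t)), n) < prefix("b", t, n))
```

Both words have the same smallest rotation `aabab` and the same greatest rotation `babaa`,
matching the formula $(aPal(v)b)^\omega$ and $(bPal(v)a)^\omega$ with $Pal(v) = aba$.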

The preceding result of Pirillo concerning characteristic Sturmian words and characteristic
periodic balanced words (property (1)) encompasses Theorem 2, one of the key properties
underlying the main theorem in Bugeaud and Dubickas' paper [21]. In fact, as mentioned
previously, Theorem 2 was known much earlier: in 1993, Berstel and Seebold [19] (as well
as Borel and Laubie [20]) proved one direction of the theorem, namely that characteristic
Sturmian words satisfy (1). This Sturmian extremal property also resurfaced in 2001, under a
different guise, in a paper of S. Gan [35]. However, it seems that P. Veerman [84] was actually
the first to prove (1) for Sturmian sequences in 1987, albeit from a symbolic dynamical
perspective and in an implicit way. A year prior, Veerman had already proved that characteristic
Sturmian sequences have the above extremal property [83, Theorem 2]; it was not until
[84, Theorem 2.1] that he proved the equivalence. Motivated by the combinatorics of the
Mandelbrot set, Bullett and Sentenac [22] reproved these results of Veerman, in the language of
ordered sets.

In this section, we shall first discuss the combinatorial work of Pirillo and others in relation
to the inequalities (1) and their generalizations. Following this, we will consider in more
detail the earlier work by Berstel and Seebold [19], Gan [35], and Veerman [83, 84].

5.1. Pirillo's work continued. Continuing his work in relation to the inequalities (1), Pirillo
[73] proved further that, in the case of an arbitrary finite alphabet $\mathcal{A}$, an infinite
word $s$ on $\mathcal{A}$ is epistandard if and only if, for any acceptable pair $(a, <)$, we
have

(2) $as \leq \min(s)$.

Moreover, $s$ is a strict epistandard word if and only if (2) holds with strict equality for any
order [51].

In a similar spirit, Pirillo [74] defined fine words over two letters; that is, an infinite word
$t$ over a 2-letter alphabet $\{a, b\}$ ($a < b$) is said to be fine if
$(\min(t), \max(t)) = (as, bs)$ for some infinite word $s$. These infinite words were
characterized in [73] by showing that fine words on $\{a, b\}$ are exactly the Sturmian and skew
infinite words (see Section 3.2). Specifically:

Theorem 19. Let t be an infinite word over {a, b}. The following properties are equivalent: 

(i) t is fine, 

(ii) either $t$ is Sturmian, or $t$ is an ultimately periodic (but not periodic) shift of an infinite
word of the form $\mu(x^{\ell} y x^{\omega})$ for some $\ell \in \mathbb{N}$, where $\mu$ is a pure standard morphism on
$\{a, b\}$ and $\{x, y\} = \{a, b\}$ (these words are the skew words).

In other words, a fine word over two letters is either a Sturmian word or an ultimately 
periodic (but not periodic) infinite word, all of whose factors are Sturmian. 

Pirillo [74] remarked that perhaps his characterization of fine words could be generalized 
to an arbitrary finite alphabet; indeed, Glen [37] soon generalized this result by extending 
Pirillo's definition of fine words to more than two letters. That is: 

Definition 20. [37] An infinite word $t$ on $A$ is said to be fine if there exists an infinite word
$s$ such that $\min(t) = as$ for any acceptable pair $(a, <)$.

Note. It is easy to see that Pirillo's original 2-letter definition of a fine word is a special
instance of the above definition. Certainly, as there are only two lexicographic orders on
words over a 2-letter alphabet, it follows from Definition 20 that a fine word $t$ over $\{a, b\}$
($a < b$) satisfies $(\min(t), \max(t)) = (as, bs)$ for some infinite word $s$.

Glen [37] characterized these generalized fine words (given in Definition 20) by showing
that such an infinite word is either a strict episturmian word or a strict episkew word. More
precisely:

Theorem 21. [37] Let $t$ be an infinite word with $\mathrm{Alph}(t) = A$. Then, $t$ is fine if and only if
one of the following holds: 

(i) t is an A-strict episturmian word; 

(ii) $t$ is non-recurrent and takes the form $\mu(xs)$ where $x$ is a letter, $s$ is a strict epistandard
word on $A \setminus \{x\}$, and $\mu$ is a pure episturmian morphism on $A$.

Remark 22. Note that part (ii) of Theorem 21 gives the form of so-called strict episkew
words; it is slightly simpler than what was originally given in [30], thanks to Richomme (private
communication). Also note that strict episkew words on a 2-letter alphabet are precisely the
skew words (see [39]). One can also compare Theorem 21 with Theorem 19. A simple example
of an episkew word is $cf := cabaababaaba\cdots$, where $f$ is the Fibonacci sequence on $\{a, b\}$.

Example 23. [37] Let $A = \{a, b, c\}$ with $a < b < c$. Let $f$ denote the infinite Fibonacci word
over $\{a, b\}$, i.e., the epistandard word directed by $(ab)^{\omega}$. Then, the following infinite words
are fine.

• $f = abaababaabaaba\cdots$

• $cf = cabaababaabaaba\cdots$

• $\widetilde{f_4}\,cf = aabacabaababaabaaba\cdots$

• $\psi_a(f) = aabaaabaabaaabaaaba\cdots$

• $\psi_c(cf) = ccacbcacacbcacbcacacbcacacbca\cdots$

• $\psi_c(\widetilde{f_4}\,cf) = cacacbcaccacbcacacbcacbcacacbcaca\cdots$

Let us note, for example, that $\psi_c(f)$ is not fine since it is a non-strict epistandard word.
That is, $\psi_c(f)$ is an epistandard word with directive word $c(ab)^{\omega}$, so it is not strict, nor does
it take the second form given in Theorem 21.

Continuing this work, Glen, Justin, and Pirillo [40] recently proved new characterizations
of finite Sturmian and episturmian words via lexicographic orderings. As a consequence, 
they were able to characterize by lexicographic order all episturmian words in a wide sense 
(episturmian and episkew infinite words). Similarly, they characterized by lexicographic order 
all balanced infinite words on a 2-letter alphabet; in other words, all Sturmian, periodic 
balanced, and skew infinite words, the factors of which are (finite) Sturmian. 

In the finite case: 

Theorem 24. [40] A finite word $w$ on $A$ is episturmian if and only if there exists a finite
word $u$ on $A$ such that, for any acceptable pair $(a, <)$, we have

(3) $a u_{|m|-1} \le m$

where $m = \min(w)$ for the considered order. □

A corollary of Theorem 24 is the following new characterization of finite Sturmian words
(i.e., finite balanced words).

Corollary 25. [40] A finite word $w$ on $A = \{a, b\}$, $a < b$, is not Sturmian (in other words,
not balanced) if and only if there exists a finite word $u \in \{a, b\}^*$ such that $aua$ is a prefix of
$\min(w)$ and $bub$ is a prefix of $\max(w)$. □

In the infinite case, a characterization of episturmian words in the wide sense follows almost
immediately from Theorem 24. That is:

Corollary 26. [50] An infinite word $t$ on $A$ is episturmian in the wide sense (i.e., episturmian
or episkew) if and only if there exists an infinite word $u$ on $A$ such that

$au \le \min(t)$

for any acceptable pair $(a, <)$.

Consequently, an infinite word $s$ on $\{a, b\}$ ($a < b$) is balanced (i.e., Sturmian, periodic
balanced, or skew) if and only if there exists an infinite word $u$ on $\{a, b\}$ such that

(4) $au \le \min(s) \le \max(s) \le bu$.

For any sequence $s$, $\max(s)$ is the same as $\sup\{T^k(s), k \ge 0\}$, and similarly $\min(s) =
\inf\{T^k(s), k \ge 0\}$, where the infimum and supremum are taken with respect to the
lexicographic order. The preceding result therefore shows that a sequence $s$ in $\{0, 1\}^{\omega}$ is balanced
if and only if there exists a sequence $u \in \{0, 1\}^{\omega}$ such that $0u \le T^k(s) \le 1u$ for all $k \ge 0$. In
particular, a sequence $s$ on $\{0, 1\}$ being Sturmian is equivalent to $s$ being aperiodic and the
existence of a sequence $u$ on $\{0, 1\}$ such that $0u \le T^k(s) \le 1u$. Moreover, it follows from the
proof of Theorem 19 (or Theorem 21) that $u$ is the unique characteristic Sturmian sequence
having the same slope as $s$. This is exactly Theorem 2.1. For the sake of completeness, we
give a direct proof below.
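
The criterion just stated can be tried out on finite prefixes. The following is only a minimal sketch, not part of the original argument: the function names are hypothetical, infinite words are replaced by finite prefixes, and shifts are compared on windows of a fixed length. Writing the smallest and largest windows as $0x$ and $1y$, a result of False exhibits factors $0w0$ and $1w1$ and so certifies that the word is unbalanced, whereas True only means that no obstruction was found up to the chosen window length.

```python
def fibonacci_word(n):
    """First n letters of the Fibonacci word on {0, 1} (fixed point of 0 -> 01, 1 -> 0)."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if ch == "0" else "0" for ch in w)
    return w[:n]


def thue_morse_word(n):
    """First n letters of the Thue-Morse word on {0, 1} (fixed point of 0 -> 01, 1 -> 10)."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if ch == "0" else "10" for ch in w)
    return w[:n]


def extremal_windows(s, length):
    """Lexicographically least and greatest factors of the given length occurring in s."""
    windows = [s[k:k + length] for k in range(len(s) - length + 1)]
    return min(windows), max(windows)


def passes_truncated_test(s, length):
    """Truncated version of: there is u with 0u <= T^k(s) <= 1u for all k.

    With min = 0x and max = 1y, such a u exists exactly when x >= y.
    """
    lo, hi = extremal_windows(s, length)
    if lo[0] == hi[0]:
        # Only one letter occurs in the sampled positions; nothing to check.
        return True
    return lo[1:] >= hi[1:]


f = fibonacci_word(2000)
lo, hi = extremal_windows(f, 20)
print(lo == "0" + f[:19], hi == "1" + f[:19])
print(passes_truncated_test(f, 20), passes_truncated_test(thue_morse_word(2000), 20))
```

For the Fibonacci word the extremal windows are the prefixes of $0u$ and $1u$ with $u$ the Fibonacci word itself, in line with the statement that $u$ is the characteristic sequence of the same slope.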

Direct proof of Theorem 1. Let $s$ be an aperiodic sequence on $\{0, 1\}$. First suppose that $s$
is a Sturmian sequence. Since it contains both 0's and 1's, there exist two binary sequences
$x$ and $y$ such that $0x := \inf\{T^k(s), k \ge 0\}$ and $1y := \sup\{T^k(s), k \ge 0\}$. We claim that
$x \ge y$. Namely, if $x < y$, there exist a (possibly empty) word $w$ and two infinite sequences
$x'$ and $y'$ such that $x = w0x'$ and $y = w1y'$. Hence $0x = 0w0x'$ and $1y = 1w1y'$. Since any
factor of $\inf\{T^k(s), k \ge 0\}$ and of $\sup\{T^k(s), k \ge 0\}$ is a factor of $s$, we have that both $0w0$
and $1w1$ are factors of $s$. Hence $s$ is unbalanced (see Definition 7 and the comments following
it), but it was supposed Sturmian, a contradiction (Theorem 8). Thus $x \ge y$, and hence

$\forall k \ge 0, \quad 0x \le T^k(s) \le 1y \le 1x.$

Now suppose that $s$ has the property that there exists a binary sequence $u$ such that

(5) $\forall k \ge 0, \quad 0u \le T^k(s) \le 1u.$

Let $z$ be a left special factor (if any) of $s$, and let $z'$ be the prefix of $u$ that has the same
length as $z$. Since $0z$ and $1z$ are both factors of $s$, there exist two integers $\ell_1$ and $\ell_2$ such that
$T^{\ell_1}(s)$ begins with $0z$ and $T^{\ell_2}(s)$ begins with $1z$. We deduce from the inequalities (5) with
$k = \ell_1$ (resp. $\ell_2$) that

$0z' \le 0z$ and $1z \le 1z'$.

This implies

$z' \le z$ and $z \le z'$



hence $z = z'$. Thus $s$ has at most one left special factor of each length. Hence $s$ is Sturmian
(Proposition 9), and its left special factors are exactly the prefixes of $u$.

This implies furthermore that $u$ belongs to the closure of the shift-orbit of $s$, hence it is
Sturmian. But the prefixes of $0u$ and $1u$ are also factors of $s$. Hence $0u$ and $1u$ are also in the
closure of the shift-orbit of $s$, thus Sturmian. This implies that $u$ is Sturmian characteristic
(see, e.g., [65, Proposition 2.1.22]). Thus $u$ is the (unique) characteristic Sturmian sequence
having the same slope as $s$. □

Remark 27. We noted in the Introduction that Theorem 2 can be easily deduced from
Theorem 1. Actually Theorem 1 can also be deduced from Theorem 2: it suffices to remember
that the closure of the shift-orbit of a characteristic Sturmian sequence $u$ is exactly the set of
all Sturmian sequences having the same slope as $u$ (see for instance [65, Proposition 2.1.25]),
and all of these Sturmian sequences have the same set of factors ([65, Proposition 2.1.18], or
[67]). See also Remark 33 later.

Recently, Richomme [77] proved that episturmian words can be characterized via a nice
"local balance property". That is:

Theorem 28. [77] For a recurrent infinite word $t \in A^{\omega}$, the following assertions are
equivalent:

(i) $t$ is episturmian;

(ii) for each factor $u$ of $t$, there exists a letter $a$ such that $AuA \cap F(t) \subseteq auA \cup Aua$;

(iii) for each palindromic factor $u$ of $t$, there exists a letter $a$ such that $AuA \cap F(t) \subseteq
auA \cup Aua$.

Roughly speaking, the above theorem says that for any factor $u$ of a given episturmian
word $t$, there exists a unique letter $a$ such that every occurrence of $u$ in $t$ is immediately
preceded or followed by $a$ in $t$. When $|A| = 2$, property (ii) of Theorem 28 is equivalent to
the definition of balance. Indeed, Coven and Hedlund [24] stated that an infinite word $s$ over
$\{a, b\}$ is not balanced if and only if there exists a palindrome $u$ such that $aua$ and $bub$ are
both factors of $s$. As pointed out in [77], this property can be rephrased as follows: an infinite
word $s$ is Sturmian if and only if $s$ is aperiodic and, for any factor $u$ of $s$, the set of factors
belonging to $AuA$ is a subset of $auA \cup Aua$ or a subset of $buA \cup Aub$.
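
Property (ii) of the local balance theorem lends itself to a direct finite check. The sketch below is an illustration under stated assumptions rather than anything taken from [77]: the helper names are hypothetical, the infinite word is replaced by a finite prefix, and only factors up to a chosen length are examined. Since the two-sided extensions seen in a prefix form a subset of those of the infinite word, a prefix of an episturmian word can never fail the test.

```python
def word_from_morphism(morphism, seed, n):
    """First n letters of the fixed point of a morphism that is prolongable on `seed`."""
    w = seed
    while len(w) < n:
        w = "".join(morphism[ch] for ch in w)
    return w[:n]


def local_balance_holds(t, max_len):
    """Check, for every factor u of t with |u| <= max_len, that some letter a satisfies:
    every factor of the form x u y (x, y letters) has x == a or y == a."""
    alphabet = set(t)
    for n in range(max_len + 1):
        extensions = {}  # factor u -> set of (x, y) such that x u y occurs in t
        for k in range(len(t) - n - 1):
            x, u, y = t[k], t[k + 1:k + 1 + n], t[k + 1 + n]
            extensions.setdefault(u, set()).add((x, y))
        for pairs in extensions.values():
            if not any(all(x == a or y == a for x, y in pairs) for a in alphabet):
                return False
    return True


# Tribonacci word (an episturmian word on three letters) versus the Thue-Morse word.
tribonacci = word_from_morphism({"a": "ab", "b": "ac", "c": "a"}, "a", 500)
thue_morse = word_from_morphism({"a": "ab", "b": "ba"}, "a", 500)
print(local_balance_holds(tribonacci, 10), local_balance_holds(thue_morse, 10))
```

The Thue-Morse word already fails for the empty factor, because both $aa$ and $bb$ occur in it.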

Remark 29. Recall that the set of all infinite words in $A^{\omega}$ having episturmian factors consists
of the (recurrent) episturmian words and the (non-recurrent) episkew words in $A^{\omega}$.
Therefore, since properties (ii) and (iii) in Theorem 28 concern only factors, one readily deduces
that these properties in fact characterize the episturmian and episkew words in $A^{\omega}$. So the
hypothesis of recurrence in the statement of the theorem restricts attention to episturmian
words only.

We will now use Theorem 28 to give an alternative (simpler) proof of the following analogue
of Theorem 2.1 for episturmian sequences, which was originally proved in [38] (also see [40]).
This result, in particular, gives a more precise version of Corollary 26 under the hypothesis
of recurrence.






Theorem 30. A recurrent infinite word $t$ on $A$ is episturmian if and only if there exists an
infinite word $u$ on $A$ such that, for any acceptable pair $(a, <)$,

$au \le T^i(t)$ for all $i \ge 0$.

Moreover, if $t$ is aperiodic, then $u$ is the unique epistandard word with the same set of factors
as $t$ (i.e., the unique epistandard word in the closure of the shift-orbit of $t$), and for any
acceptable pair $(a, <)$, $au = \inf\{T^k(t), k \ge 0\}$ if and only if the letter $a$ occurs infinitely
often in the directive word of $u$.
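
Before turning to the proof, the inequalities $au \le T^i(t)$ can be checked experimentally. The following sketch makes several assumptions that are not in the text: the function names are hypothetical, the test word is the Tribonacci word (which is epistandard, so one may take $u = t$), infinite words are cut to finite prefixes, and every total order on the alphabet is enumerated with its minimum letter playing the role of $a$.

```python
from itertools import permutations


def tribonacci_word(n):
    """First n letters of the Tribonacci word (fixed point of a -> ab, b -> ac, c -> a)."""
    morphism = {"a": "ab", "b": "ac", "c": "a"}
    w = "a"
    while len(w) < n:
        w = "".join(morphism[ch] for ch in w)
    return w[:n]


def satisfies_extremal_inequalities(t, u, length):
    """Truncated check of a u <= T^i(t) for every total order with minimum letter a."""
    alphabet = sorted(set(t))
    windows = [t[i:i + length] for i in range(len(t) - length + 1)]
    for order in permutations(alphabet):
        rank = {letter: r for r, letter in enumerate(order)}

        def key(word):
            # Translate a word into ranks so that list comparison follows `order`.
            return [rank[ch] for ch in word]

        a = order[0]  # (a, <) is acceptable: a is the minimum letter for this order
        bound = key(a + u[:length - 1])
        if any(key(w) < bound for w in windows):
            return False
    return True


t = tribonacci_word(1000)
print(satisfies_extremal_inequalities(t, t, 12))
print(satisfies_extremal_inequalities(t, "b" * 12, 12))
```

The second call uses an arbitrary wrong candidate for $u$ and is rejected, since windows beginning with $aa$ are smaller than $ab\cdots$ for the usual order.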

Proof. Let t be a recurrent infinite word on A. 

First suppose that $t$ is episturmian. Let $x$ be a letter in $A$ and consider two different total
orders $<_1$ and $<_2$ on $A$ such that $(x, <_1)$ and $(x, <_2)$ are acceptable pairs. Then there exist
infinite words $u$ and $v$ on $A$ such that

(6) $xu = \inf_1\{T^k(t), k \ge 0\}$ for the total order $<_1$ on $A$,

and

(7) $xv = \inf_2\{T^k(t), k \ge 0\}$ for the total order $<_2$ on $A$.

(Here, $\inf_i$ denotes the infimum with respect to the order $<_i$ for $i = 1, 2$.) We will show that
$u = v$. By equations (6) and (7), we have

$xu \le_1 xv$ and $xv \le_2 xu$.

Hence, if $u_n$ and $v_n$ are the prefixes of length $n$ of the respective words $u$ and $v$, then we have
$u_n \le_1 v_n$ and $v_n \le_2 u_n$. This implies that $u_n = v_n$, and therefore $u = v$. Hence, for a given letter $x$
in $A$, there exists a unique infinite word $u$ on $A$ such that

(8) $xu = \inf_x\{T^k(t), k \ge 0\}$ for any acceptable pair $(x, <_x)$.

Now consider another letter $y$ in $A \setminus \{x\}$. By what precedes, we know there exists a unique
infinite word $v$ on $A$ such that

(9) $yv = \inf_y\{T^k(t), k \ge 0\}$ for any acceptable pair $(y, <_y)$.

Again, we will show that $u = v$. Suppose not. Then there exist a (possibly empty) word $w$
and two infinite words $u'$ and $v'$ over $A$ such that $u = wz_1u'$ and $v = wz_2v'$ for some letters
$z_1$ and $z_2$ with $z_1 \ne z_2$. Hence $xu = xwz_1u'$ and $yv = ywz_2v'$, and therefore the words $xwz_1$
and $ywz_2$ are both factors of $t$, since any factor of $xu$ and of $yv$ is also a factor of $t$ (by (8)
and (9)). But then, by Richomme's local balance property (Theorem 28), $z_2 = x$ or $z_1 = y$.

If $z_2 = x$, then for any acceptable pair $(x, <_x)$, we have $x <_x z_1$ (since $z_1 \ne z_2$), and hence
$xv\ (= xwxv') <_x xu\ (= xwz_1u')$, contradicting the (lexicographical) minimality of $xu$ with
respect to the total order $<_x$. Likewise, if $z_1 = y$, then for any acceptable pair $(y, <_y)$, we
have $y <_y z_2$ (since $z_1 \ne z_2$), and hence $yu\ (= ywz_1u') <_y yv\ (= ywz_2v')$, a contradiction.
Thus $u = v$.

Hence, there exists a (unique) infinite word $u$ on $A$ such that, for any acceptable pair
$(a, <)$, $au \le T^i(t)$ for all $i \ge 0$.

Conversely, suppose there exists an infinite word $u$ on $A$ such that, for any acceptable pair
$(a, <)$, we have

(10) $au \le T^i(t)$ for all $i \ge 0$.

Let $z$ be a left special factor (if any) of $t$, and let $z'$ denote the prefix of $u$ with $|z'| = |z|$.
Since $z$ is left special in $t$, there exist at least two distinct letters $x$, $y$ such that $xz$ and $yz$ are
both factors of $t$. In particular, there exist non-negative integers $\ell_1$ and $\ell_2$ such that $T^{\ell_1}(t)$
begins with $xz$ and $T^{\ell_2}(t)$ begins with $yz$. Thus, by inequality (10), we have

$xz' \le_x xz$ for any acceptable pair $(x, <_x)$,

and

$yz' \le_y yz$ for any acceptable pair $(y, <_y)$.

Hence $z' \le_x z$ and $z' \le_y z$, and this implies that $z = z'$. Therefore $t$ has at most one left
special factor of each length and the left special factors of $t$ are exactly the prefixes of $u$.
Thus $F(u) \subseteq F(t)$; in particular, $u$ is in the closure of the shift-orbit of $t$.

Now suppose that $t$ is not episturmian. Then, by Theorem 28, there exist a word $w$
(possibly empty) and letters $a$, $b$, $c$, and $d$ with $\{a, b\} \cap \{c, d\} = \emptyset$ such that $awb$ and $cwd$ are
both factors of $t$. Since $a \ne c$, the word $w$ is a left special factor of $t$, and therefore $w$ is a
prefix of $u$.

Let $\ell_1$ and $\ell_2$ be non-negative integers such that $T^{\ell_1}(t)$ begins with $awb$ and $T^{\ell_2}(t)$ begins
with $cwd$. Then, for any two acceptable pairs $(a, <_a)$ and $(c, <_c)$, we have

(11) $au\ (= awz\cdots) \le_a T^{\ell_1}(t)\ (= awb\cdots)$,

and

(12) $cu\ (= cwz\cdots) \le_c T^{\ell_2}(t)\ (= cwd\cdots)$,

where $z$ is the letter that follows the prefix $w$ in $u$.

Inequality (11) implies that $z \le_a b$, whereas inequality (12) implies that $z \le_c d$, and moreover
$z \le_c b$ and $z \le_a d$. These inequalities imply that $z = b = d$, a contradiction.

Hence $t$ is episturmian, and therefore $u$ is episturmian too (since $u$ is in the closure of the
shift-orbit of $t$, which consists of all episturmian words with the same set of factors as $t$ - see
Remark 15 or [39]). Moreover, $u$ is epistandard since all of its left special factors are prefixes
of it. Therefore, for any letter $x$ in $A$, $xu$ is episturmian if and only if $x$ occurs infinitely often
in the directive word of u (see |5Ul Theorem 3.17], |38| Theorem 2.6], or \J7\ Theorem 6]). 
Hence, for any acceptable pair $(a, <)$, $au = \inf\{T^k(t), k \ge 0\}$ if and only if the letter $a$ occurs
infinitely often in the directive word of $u$. □

Remark 31. An unrelated connection between finite balanced words (i.e., finite Sturmian
words) and lexicographic ordering was recently studied by Jenkinson and Zamboni [38], who
presented three new characterizations of "cyclically" balanced finite words via orderings.
Their characterizations are based on the ordering of shift-orbits, either lexicographically or
with respect to the 1-norm $|\cdot|_1$, which counts the number of occurrences of the symbol 1 in
a given finite word over $\{0, 1\}$.

5.2. Sturmian morphisms. Prior to the recent work of Pirillo and others, the extremal
property (1) was shown to hold for characteristic Sturmian sequences in a paper by Berstel
and Seebold [19]. Here is a reformulation of their result (recalling the definition of $s_{\alpha,\rho}$ from
Section 3.2 and letting $c_\alpha := s_{\alpha,\alpha} = s'_{\alpha,\alpha}$ denote the unique characteristic Sturmian sequence
of slope $\alpha$):

Proposition 32. [19, Property 7] Let $\alpha > 0$ be an irrational number. Then, for all $i \ge 1$, we
have

$ac_\alpha < T^i(ac_\alpha)$ and $bc_\alpha > T^i(bc_\alpha)$.

In particular, for all $i \ge 0$, we have

$ac_\alpha < T^i(c_\alpha) < bc_\alpha$.

Remark 33. Recall from Remark 15 that the closure of the shift-orbit of any Sturmian word $s$
is a minimal dynamical system consisting of all the Sturmian words with the same set of factors
as $s$ (also see [65, Proposition 2.1.25]). In particular, if $s$ is a Sturmian word with (irrational)
slope $\alpha$, then $O(s)$ consists of all Sturmian words of slope $\alpha$ (e.g., see [65, Proposition 2.1.18]
or [67]). Accordingly, the second part of Proposition 32 (also see Theorems 1 and 2) tells
us that $ac_\alpha$ and $bc_\alpha$ are the lexicographically least and greatest Sturmian words of slope $\alpha$,
respectively.

Proposition 32 was also proved by Borel and Laubie [20] in the same year (1993). In [19],
Berstel and Seebold showed that it is an easy consequence of the following more general result.

Proposition 34. Let $\alpha > 0$ be an irrational number and let $\rho$, $\rho'$ be real numbers such that
$0 \le \rho, \rho' < 1$. Then

$s_{\alpha,\rho} < s_{\alpha,\rho'} \iff \rho < \rho'$.

The above proposition was one of numerous results in [19] leading to the proof of a now
well-known characterization of Sturmian morphisms, i.e., morphisms that preserve Sturmian
words. Specifically, a morphism on $\{a, b\}$ is Sturmian if and only if it can be expressed as a
finite composition of the following morphisms, in any number and order:

$E : \begin{cases} a \mapsto b \\ b \mapsto a \end{cases}, \qquad \varphi : \begin{cases} a \mapsto ab \\ b \mapsto a \end{cases}, \qquad \tilde{\varphi} : \begin{cases} a \mapsto ba \\ b \mapsto a \end{cases}$

(Note that $\varphi = \psi_a \theta_{ab}$ and $\tilde{\varphi} = \bar{\psi}_a \theta_{ab}$; see Section 3.3.)
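
The three generating morphisms are easy to experiment with. The sketch below is only illustrative and rests on assumptions of ours: morphisms are stored as Python dictionaries, the names `apply_morphism`, `compose` and `is_balanced` are hypothetical helpers, and "preserving Sturmian words" is tested only through the balance of images of a finite Fibonacci prefix.

```python
E = {"a": "b", "b": "a"}
PHI = {"a": "ab", "b": "a"}
PHI_TILDE = {"a": "ba", "b": "a"}


def apply_morphism(morphism, word):
    """Image of a finite word under a morphism given letter by letter."""
    return "".join(morphism[ch] for ch in word)


def compose(*morphisms):
    """compose(f, g, h) is the morphism w -> f(g(h(w)))."""
    result = {"a": "a", "b": "b"}
    for m in reversed(morphisms):
        result = {x: apply_morphism(m, image) for x, image in result.items()}
    return result


def is_balanced(word):
    """Finite balance: factors of equal length differ by at most 1 in their number of a's."""
    for n in range(1, len(word) + 1):
        counts = {word[k:k + n].count("a") for k in range(len(word) - n + 1)}
        if max(counts) - min(counts) > 1:
            return False
    return True


# A finite Fibonacci prefix, obtained by iterating PHI on the letter a.
fib = "a"
for _ in range(8):
    fib = apply_morphism(PHI, fib)

m = compose(PHI, E, PHI_TILDE)
print(m)
print(is_balanced(fib), is_balanced(apply_morphism(m, fib)))
```

By contrast, the Thue-Morse morphism $a \mapsto ab$, $b \mapsto ba$ is not of this form, and the image of the same prefix under it contains both $aa$ and $bb$.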

This result played a particularly important role in Berstel and Seebold's characterization
of morphisms that preserve characteristic Sturmian words - the so-called characteristic or
standard (Sturmian) morphisms. That is, a morphism on $\{a, b\}$ is standard if and only if it
is expressible as a finite composition of the morphisms $E$ and $\varphi$ in any number and order
[19]. The fact that there is no occurrence of the morphism $\tilde{\varphi}$ in such a composition is due to
Proposition 32.

5.3. The lexicographic world. As mentioned previously, a disguised form of Theorem 2
(also see (1)) appeared in S. Gan's paper [35]; in fact, as we shall see, Theorem 1 can be
deduced from the main results in [35]. Gan came across this property of Sturmian sequences
whilst endeavouring to obtain a complete description of the lexicographic world, defined as
follows.

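For any two infinite words $x, y \in \{0, 1\}^{\omega}$, define the set

$\Sigma_{xy} := \{s \in \{0, 1\}^{\omega},\ \forall i \ge 0,\ x \le T^i(s) \le y\}.$

The lexicographic world $\mathcal{L}$ is defined by

$\mathcal{L} := \{(x, y) \in \{0, 1\}^{\omega} \times \{0, 1\}^{\omega},\ \Sigma_{xy} \ne \emptyset\}.$

Gan proved in [35, Lemma 2.1] that

$\mathcal{L} = \{(u, v) \in \{0, 1\}^{\omega} \times \{0, 1\}^{\omega},\ v \ge \phi(u)\},$

where $\phi : \{0, 1\}^{\omega} \to \{0, 1\}^{\omega}$ is the map defined by

$\phi(x) := \inf\{y \in \{0, 1\}^{\omega},\ \Sigma_{xy} \ne \emptyset\}.$

As Gan points out in that paper, the set $\mathcal{L}$ is closely related to the bifurcation of a Lorenz-like
map (see [63] for example).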

The following theorem combines Corollary 5.6 and Theorem 5.7 from Gan's paper [35]
(also see Theorem 1.1 in the same paper). It shows in particular that any element in the
image of $\phi$ is a Sturmian or periodic balanced sequence in $\{0, 1\}^{\omega}$ (and such sequences are
the lexicographically greatest amongst their shifts).

Theorem 35. For any sequence $s \in \{0, 1\}^{\omega}$, the following conditions are equivalent.

(i) $s = \phi(x)$ for some sequence $x \in \{0, 1\}^{\omega}$.

(ii) $s$ is a Sturmian or periodic balanced sequence satisfying $T^i(s) \le s$ for all $i \ge 0$.

Moreover, if $x$ begins with 1, then $\phi(x) = 1^{\omega}$, and if $x = 0u$ for some $u \in \{0, 1\}^{\omega}$, then $\phi(x)$
is the unique Sturmian or periodic balanced sequence $s$ in $\{0, 1\}^{\omega}$ satisfying $0u \le T^i(s) \le 1u$
and $T^i(s) \le s$ for all $i \ge 0$.

In the process of establishing Theorem 35, Gan also proved the following description of
Sturmian minimal sets (see [13] for a definition; also note that minimal sets correspond to
minimal dynamical systems).

Theorem 36. [35] A minimal set $M$ is a Sturmian minimal set if and only if $M \subseteq [0x, 1x] :=
\{y \in \{0, 1\}^{\omega},\ 0x \le y \le 1x\}$ for some $x \in \{0, 1\}^{\omega}$. Moreover, for any $x \in \{0, 1\}^{\omega}$, there exists
a unique Sturmian minimal set in $[0x, 1x]$.

Theorem 36 actually encompasses the first part of Theorem 2; indeed, it can be interpreted
as follows: a uniformly recurrent sequence $y \in \{0, 1\}^{\omega}$ satisfies $0x \le T^i(y) \le 1x$ for all $i \ge 0$
and some binary sequence $x$ if and only if $y$ is a Sturmian or periodic balanced sequence. As
discussed in Section 5.1, this result was recently rediscovered by Glen, Justin, and Pirillo [40]
(see (4)), but in a slightly stronger form without the uniform recurrence condition, giving
that $y$ is either a Sturmian sequence, a periodic balanced sequence, or a skew sequence (i.e.,
$y$ is a balanced sequence).

The second part of Theorem 1 can also be deduced from Gan's work, as follows. Let $u$ be any
characteristic Sturmian sequence on $\{0, 1\}$. Then, by Theorem 35, the sequence $s := \phi(0u)$ is
the unique Sturmian sequence satisfying $0u \le T^i(s) \le 1u$ and $T^i(s) \le s$ for all $i \ge 0$. Suppose
$x$ is the unique characteristic Sturmian sequence in $O(s)$, the closure of the shift-orbit of $s$.
Then $0x$ and $1x$ are Sturmian sequences, by [65, Proposition 2.1.22]. Moreover, $0x$ and $1x$
have the same set of factors as $x$ since the prefixes of $x$ are exactly its left special factors.
Hence, both $0x$ and $1x$ are in $O(s)$, and therefore, since $0u \le T^i(s) \le 1u$ for all $i \ge 0$, we
have $0u \le 0x$ and $1x \le 1u$. These inequalities imply that $u = x$. Thus, for any characteristic
Sturmian sequence $x$, we have $0x < T^i(x) < 1x$ for all $i \ge 0$. This establishes the forward
direction of Theorem 2, and it follows that for any Sturmian sequence $s$ on $\{0, 1\}$, we have
$0u \le T^i(s) \le 1u$ for all $i \ge 0$, where $u$ is the unique characteristic Sturmian sequence with
the same slope as $s$ (recall Remark 33). This proves the second part of Theorem 1, and from
this theorem one can easily deduce both directions of Theorem 2 (see Remark 27).

Remark 37. By Remark 33, the lexicographically least and greatest Sturmian sequences in
the closure of the shift-orbit of a Sturmian sequence $s$ on $\{0, 1\}$ are $0u$ and $1u$ where $u$ is
the unique characteristic Sturmian sequence with the same slope as $s$. We thus deduce from
Theorems 1 and 35 that, for any sequence $x$ on $\{0, 1\}$ beginning with 0, the sequence $\phi(x)$
is a Sturmian or periodic balanced sequence of the form $1u$. Moreover, if $\phi(x)$ is Sturmian,
then $u$ is the unique characteristic Sturmian sequence with the same slope as $\phi(x)$.

The following lemma was a key step in Gan's proof of Theorem 36. It involves the block
condition (BC): a sequence $s \in \{0, 1\}^{\omega}$ satisfies the BC if, for any finite word $w$ on $\{0, 1\}$, at
least one of the words $0w0$ and $1w1$ is not a factor of $s$.

Lemma 38. [35, Lemma 4.4] A sequence $s \in \{0, 1\}^{\omega}$ satisfies the BC if and only if there
exists a sequence $u$ such that $0u \le T^i(s) \le 1u$ for all $i \ge 0$.

This result is essentially the characterization of balanced infinite words given in [40] (see
(4)). Indeed, the BC is equivalent to the balance property, as defined in Definition 7. See
Section 3 in [24], in which the balance property is called the Sturmian block condition (see
also [77]). Note that the BC of Coven and Hedlund [24, Lemma 3.06 p. 143] is stronger than
Gan's in that "for any finite word $w$" is replaced by "for any palindrome $w$".

Remark 39. As explained by Labarca and Moreira in [60], the terminology "lexicographical 
world" was coined in 2000, in a preprint version of [62] (which appeared only in 2006) in which 
the authors extended the work of Hubbard and Sparrow [35]. For more on the lexicographic(al) 
world, the reader can look at, e.g., [6HI62] and the references therein. See also the recent paper 
[8], in which the present two authors give a complete description of the lexicographic world in
the process of describing the minimal intervals containing all fractional parts $\{\xi 2^n\}$, for some
positive real number $\xi$, and for all $n \ge 0$.

5.4. The early work of Veerman: 1986 & 1987. Let S a denote the set of all Sturmian 
sequences of (irrational) slope a > over the alphabet {0, 1} (i.e., a \— ► 0, b \— > 1 in Theorem^]). 
As noted, e.g., in [16], each Sturmian sequence $s \in S_\alpha$ can be viewed as the binary expansion
of some real number $r(s)$ modulo 1. Moreover, it is easily verified that, for any $s, s' \in S_\alpha$,
we have $s < s'$ if and only if $r(s) < r(s')$. Furthermore, by Remark 33, we know that
the lexicographically least and greatest sequences in $S_\alpha$ are $0c_\alpha$ and $1c_\alpha$, respectively. In
terms of binary expansions, as $r(1c_\alpha) = 1/2 + r(0c_\alpha)$, it follows that the set $r(S_\alpha) := \{r(s) \in
[0, 1),\ s \in S_\alpha\}$ is completely contained within the closed interval $[r(0c_\alpha), r(1c_\alpha)]$ of length 1/2
and not in any smaller interval. This latter result (to compare with Bugeaud-Dubickas' result
where base 2 is replaced with base $b$ [21]) is essentially a reformulation of Theorem 2 p. 558 in
Veerman's paper [83], which also states that $r(S_\alpha)$ is a Cantor set and that $[r(0c_\alpha), r(1c_\alpha)]$ has
Lebesgue measure zero. The converse of this theorem was proved one year later by Veerman
in [84, Theorem 2.1, p. 193-194]. As such, it seems that Veerman was the first to (implicitly)
prove the Sturmian extremal property given in Theorem 1, under the framework of symbolic
dynamics. Actually, Veerman's main result in [84] shows that a sequence $s$ in $\{0, 1\}^{\omega}$ satisfies
the inequalities $0u \le T^i(s) \le 1u$ for some sequence $u \in \{0, 1\}^{\omega}$ and for all $i \ge 0$ if and only
if $s$ is a Sturmian sequence or a periodic balanced sequence (cf. (4)). A few years earlier (in
1984), Gambaudo et al. [33] had already proved the periodic case (i.e., the case when $\alpha$ is
rational); Veerman considered his Theorem 2.1 in [83] to be a generalization of their main
result.

Remark 40. Note that the set $r(S_\alpha)$ is a dynamical system under the operation of the
doubling map $\sigma : x \mapsto 2x \pmod 1$ on the one-dimensional torus $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. This was the point
of view of Veerman and also that of Bullett and Sentenac [22], who gave reformulations and
self-contained combinatorial proofs of some of Veerman's results in [83, 84]. In particular,
Bullett and Sentenac gave another proof of the following result (which can be deduced from
Veerman's work): for each closed interval $C_\mu = [\mu, 1/2 + \mu]$ of length 1/2 (where $\mu \in \mathbb{T}$),
there exists a unique $\alpha$ such that $r(S_\alpha)$ is contained in $C_\mu$ and there is no other dynamical
system for the doubling map that is a strict subset of $C_\mu$. This fact was recently used
by Jenkinson [16] to prove new characterizations of Sturmian measures, which have
applications to ergodic optimization of convex functions. Another important application is in the
combinatorial description of the Mandelbrot set (e.g., see [22, 56]).

Remark 41. In the study of kneading sequences of Lorenz maps (i.e., a certain class of piece- 
wise monotonic maps on [0, 1] with a single discontinuity), Glendinning, Hubbard, and Spar- 
row [121 H5] have investigated so-called allowed pairs (r,s) of distinct binary sequences in 
{0, 1} W satisfying 

r < T*(r) < s and r < T*(s) < s for all i > 0. 

In particular, it was shown in [35] that these allowed pairs are exactly the pairs of (distinct)
binary sequences in $\{0, 1\}^{\omega}$ that are realizable as kneading invariants of a topologically
expansive Lorenz map. (Note that the case $s = 1^{\omega}$ was studied by Acquier, Cosnard, and Masse in
[1].) Moreover it can be deduced from property (1) that the allowed pairs of the form $(0u, 1u)$
are those where $u$ is a characteristic Sturmian sequence.

6. Back to distribution modulo 1: the Thue-Morse sequence shows up

As indicated in the introduction, we began writing this survey after the publication of the
paper of Bugeaud and Dubickas [21], whose starting point goes back to a paper of Mahler
[66]. In that paper Mahler defines the set of Z-numbers

$\{\xi \in \mathbb{R},\ \xi > 0,\ \forall n \ge 0,\ 0 \le \{\xi (3/2)^n\} < 1/2\},$

where $\{x\}$ is the fractional part of the real number $x$. Mahler proved that this set is at most
countable. It is still an open problem to prove that this set is actually empty. More generally,
given a real number $\alpha > 1$ and an interval $(s, t) \subset (0, 1)$ one can ask whether there exists
$\xi > 0$ such that, for all $n \ge 0$, we have $s < \{\xi \alpha^n\} < t$. Flatto, Lagarias, and Pollington [33,
Theorem 1.4] proved that, if $\alpha = p/q$ with $p$, $q$ coprime integers and $p > q \ge 2$, then any
interval $(s, t)$ such that for some $\xi > 0$, one has that $\{\xi (p/q)^n\} \in (s, t)$ for all $n \ge 0$, must
satisfy $t - s \ge 1/p$. The main result in [21] reads as follows.

Theorem 42 (Bugeaud-Dubickas). Let $b \ge 2$ be an integer and let $\xi$ be an irrational number.
Then the numbers $\{\xi b^n\}$ cannot all lie in an interval of length $< 1/b$. Furthermore there exists
a closed interval $I$ of length $1/b$ containing the numbers $\{\xi b^n\}$ for all $n \ge 0$ if and only if the
sequence of base $b$-digits of the fractional part of $\xi$ is a Sturmian sequence $s$ on the alphabet
$\{k, k + 1\}$ for some $k \in \{0, 1, \ldots, b - 2\}$. If this is the case, then $\xi$ is transcendental, and the
interval $I$ is semi-open. It is open unless there exists an integer $j \ge 1$ such that $T^j(s)$ is a
characteristic Sturmian sequence on the alphabet $\{k, k + 1\}$.
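
The Bugeaud-Dubickas theorem can be illustrated numerically with exact rational arithmetic. The sketch below is not taken from [21] and makes several simplifying assumptions: the function names are hypothetical, the base is $b = 3$ with digits in $\{1, 2\}$, each fractional part $\{\xi b^n\}$ is truncated to a fixed number of base-$b$ digits (so it is known only up to $b^{-\text{length}}$), and only finitely many $n$ are sampled.

```python
from fractions import Fraction


def morphic_digits(images, n, digit_for):
    """First n letters of the fixed point (starting from 0) of a morphism on {0, 1},
    with each letter then replaced by the digit character given in `digit_for`."""
    w = "0"
    while len(w) < n:
        w = "".join(images[ch] for ch in w)
    return "".join(digit_for[ch] for ch in w[:n])


def spread_of_fractional_parts(digits, base, length):
    """max - min of the truncations to `length` digits of {xi * base^n},
    where xi = 0.d_0 d_1 d_2 ... in the given base and n ranges over the sample."""
    values = [
        Fraction(int(digits[n:n + length], base), base ** length)
        for n in range(len(digits) - length + 1)
    ]
    return max(values) - min(values)


to_digits = {"0": "1", "1": "2"}  # Sturmian sequence on {k, k + 1} with k = 1
fibonacci = morphic_digits({"0": "01", "1": "0"}, 2000, to_digits)
thue_morse = morphic_digits({"0": "01", "1": "10"}, 2000, to_digits)

print(spread_of_fractional_parts(fibonacci, 3, 20))
print(spread_of_fractional_parts(thue_morse, 3, 20) > Fraction(1, 3))
```

For the Fibonacci digits the truncated values span exactly $1/3$, whereas the Thue-Morse digits, which do not form a Sturmian sequence, spread over more than $1/3$.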

The reader will easily see the relation between Theorem 42 and Theorems 1 and 2. Note
that the first assertion in Theorem 42 is generalized to algebraic real numbers > 1 by
Dubickas in [28]. Also note that two other papers by Dubickas [29, 30] deal with links between
distribution of $\{\xi \alpha^n\}$ modulo 1 and combinatorics on words. Furthermore the Thue-Morse
sequence, defined as the fixed point beginning with 0 of the morphism $0 \to 01$, $1 \to 10$, shows
up in these two papers: in [29] for the study of "small" and "large" limit points of $\|\xi (p/q)^n\|$,
the distance to the nearest integer of the product of any non-zero real number $\xi$ by the powers
of a rational; in [30] for the study of the "small" and "large" limit points of the sequence of
fractional parts $\{\xi b^n\}$, where $b < -1$ is a negative rational number and $\xi$ is a real number.
For work in a similar vein and with an avatar of the Thue-Morse sequence, see [54].

Interestingly enough, the Thue-Morse sequence also appeared in 1983 in another question 
of distribution, as a by-product of the combinatorial study of a set of sequences related to 
iterating continuous maps of the unit interval (see [HE]). 

Theorem 43. Define the set $\Gamma$ by

$\Gamma := \{x \in [0, 1],\ \forall k \ge 0,\ 1 - x \le \{2^k x\} \le x\}.$

Then the smallest limit point of $\Gamma$ is the number $\alpha := \sum_{n \ge 0} a_n/2^n$, where $(a_n)_{n \ge 0}$ is the
Thue-Morse sequence. The set $\Gamma$ contains only countably many elements less than $\alpha$ and they are
all rational. Furthermore any segment on the right of $\alpha$ contains uncountably many elements
of $\Gamma$. This structure around $\alpha$ repeats at infinitely many scales: $\Gamma$ is a fractal set.

The reader will have guessed that Theorem 43 above is a by-product of the combinatorial
study of the set

(13) $\Gamma := \{u \in \{0, 1\}^{\mathbb{N}},\ \forall k \ge 0,\ \bar{u} \le T^k(u) \le u\}$

where $\bar{u}$ is the sequence obtained by switching 0's and 1's in $u$ (see [4]).
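
Membership in the set (13) can be tested on finite prefixes. The following is a rough sketch with hypothetical function names, assuming the Thue-Morse sequence is indexed from $n = 0$ and begins with 0, so that the sequence $(a_n)_{n \ge 1}$ begins with 1101; shifts are compared on windows of fixed length, which tests the non-strict inequalities only up to that length. The same prefix gives a rational approximation of the number $\alpha$ of Theorem 43.

```python
from fractions import Fraction


def shifted_thue_morse(n):
    """a_1 a_2 ... a_n, where (a_k) is the Thue-Morse sequence starting with a_0 = 0."""
    w = "0"
    while len(w) < n + 1:
        w = "".join("01" if ch == "0" else "10" for ch in w)
    return w[1:n + 1]


def complement(word):
    """Switch 0's and 1's."""
    return word.translate(str.maketrans("01", "10"))


def in_gamma_up_to(u, length):
    """Truncated test of: complement(u) <= T^k(u) <= u for all k."""
    top = u[:length]
    bottom = complement(top)
    return all(bottom <= u[k:k + length] <= top for k in range(len(u) - length + 1))


u = shifted_thue_morse(1024)
alpha = Fraction(int(u[:40], 2), 2 ** 40)  # truncation of sum a_n / 2^n to 40 bits
print(u[:16], in_gamma_up_to(u, 16), float(alpha))
```

The value printed for `alpha` is twice the usual Thue-Morse constant, that is, about 0.8249.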

An avatar of the set $\Gamma$ (where large inequalities are replaced by strict inequalities) was
studied in [32] in the description of univoque numbers, i.e., real numbers $\beta$ in $(1, 2)$ such that
there exists a unique base $\beta$-expansion of 1 as $1 = \sum_{j \ge 1} u_j \beta^{-j}$, with $u_j \in \{0, 1\}$. See [7] for
more details.

In [5] JPA uses Theorem [1] to prove that a Sturmian sequence $s$ on $\{0, 1\}$ belongs to the
set $\Gamma$ (see (13)) if and only if there exists a characteristic Sturmian sequence $u$ beginning
with 1 such that $s = 1u$. (In particular, a Sturmian sequence belonging to $\Gamma$ must begin
with 11.) As an immediate corollary we have that a real number $\beta \in (1, 2)$ is univoque
and self-Sturmian (i.e., the greedy $\beta$-expansion of 1 is a Sturmian sequence) if and only
if the $\beta$-expansion of 1 is of the form $1u$, where $u$ is a characteristic Sturmian sequence
beginning with 1. Self-Sturmian numbers were introduced in [23], where it was proved that
such numbers are transcendental (see also [59] for more on related questions). Theorem [2]
was used in [23] and a proof of Theorem [T] was also given in a preprint version of that
paper (see http://arxiv.org/abs/math/0308140); it was taken off the last version, as the
author explained to JPA: first because a referee suggested it was "folklore", and second
because actually only one direction of Theorem [2] was needed. Self-Sturmian numbers have
since been generalized to self-episturmian numbers in [38], where an analogue of Theorem [1]
for episturmian sequences can also be found (see Theorem [30]).
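
The characterization from [5] recalled above can be illustrated on one example. The Python sketch below builds a prefix of a characteristic Sturmian sequence $u$ beginning with 1 and tests the condition in (13) on finite windows, both for $s = 1u$ and for $u$ itself. The choices made here are assumptions for illustration only: the slope is $(\sqrt{5} - 1)/2$, the characteristic sequence is taken with the convention $c_n = \lfloor (n+1)\alpha \rfloor - \lfloor n\alpha \rfloor$ for $n \ge 1$, and the function names are hypothetical. As before, a finite window test is only a necessary condition.

```python
from math import isqrt


def floor_n_alpha(n):
    """Exact floor(n * (sqrt(5) - 1) / 2), computed with integer square roots only."""
    return (isqrt(5 * n * n) - n) // 2


def characteristic_word(n_terms):
    """c_n = floor((n + 1) * alpha) - floor(n * alpha) for n = 1, ..., n_terms.

    Assumed convention for the characteristic Sturmian sequence of slope alpha.
    """
    return [floor_n_alpha(n + 1) - floor_n_alpha(n) for n in range(1, n_terms + 1)]


def satisfies_13_on_prefix(s, window):
    """Necessary check for (13): bar(s) <= T^k(s) <= s on finite windows."""
    head = s[:window]
    low = [1 - bit for bit in head]  # prefix of bar(s)
    return all(
        low <= s[k:k + window] <= head
        for k in range(len(s) - window + 1)
    )


u = characteristic_word(3000)  # begins with 1 because the slope exceeds 1/2
s = [1] + u                    # the sequence 1u of the characterization

print(u[:10])
print(satisfies_13_on_prefix(s, 40))  # window test for s = 1u
print(satisfies_13_on_prefix(u, 40))  # window test for u alone
```

On this example the window test passes for $s = 1u$ and fails for $u$, which begins with 10 but contains the factor 11. This agrees with the remark that a Sturmian sequence in the set must begin with 11.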

Also note that sets related to the set $\Gamma$ and to the lexicographic world occur in the study
of badly approximable numbers in [71].

We end this section with a last remark. 

Remark 44. It is tempting to try to convert the extremal property for episturmian sequences
given in Corollary [26] (see [30]) to a result in distribution modulo 1. From now on, $<$ will
denote the "usual" order on $D := \{0, 1, \ldots, d - 1\}$; other orders will be denoted by $\prec$. As we
have seen, an infinite word $t$ on $D$ is episturmian in the wide sense (i.e.,
episturmian or episkew) if and only if there exists an infinite word $u$ such that

$au \preceq \min(t)$ (*)

for any acceptable pair $(a, \prec)$. Actually, replacing the "usual" order on $D$ by another total
order is the same as keeping the order but replacing each $j$ in this set by $\sigma(j)$, where $\sigma$ is
a permutation of $D$. More precisely, $(a, \prec)$ is an acceptable pair if and only if there exists
a permutation $\sigma = \sigma_\prec$ of $D$ such that $\sigma(a) = 0$ and $i \prec j \iff \sigma(i) < \sigma(j)$. Hence, another
way of formulating (*) above is as follows: there exists an infinite word $u$ such that for all
permutations $\sigma$ of $D$ one has

$0\sigma(u) \le \min(\sigma(t))$

where $\sigma(u_0 u_1 u_2 \cdots) := \sigma(u_0)\sigma(u_1)\sigma(u_2) \cdots$ (for finite or infinite words on $D$). Hence
translating extremal properties of episturmian sequences to properties of distribution modulo 1 for
real numbers consists of looking at reals $x$ in $(0, 1)$ such that there exists a real $y$ in $(0, 1)$
with $\frac{1}{d} y_\sigma \le \{d^k x_\sigma\}$ for all integers $k \ge 0$ and for all permutations $\sigma$ (where $x_\sigma$ is the real number
obtained from $x$ by applying the permutation $\sigma$ digitwise). If $d = 2$, permuting 0's and 1's
in a real number $x$ written in base 2 is the same as replacing $x$ by $1 - x$. Hence, in that
case, the inequalities $\frac{1}{2} y_\sigma \le \{2^k x_\sigma\}$ boil down to the two families of inequalities $\frac{1}{2} y \le \{2^k x\}$
and $\frac{1}{2}(1 - y) \le \{2^k (1 - x)\} = 1 - \{2^k x\}$, i.e., $\frac{1}{2} y \le \{2^k x\} \le \frac{1}{2} + \frac{1}{2} y$ for all $k$. This is
precisely the question from which we started our paper, but for general $d$ it does not seem that
number theorists have been interested in distribution modulo 1 combined with permuting
digits.
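
The case $d = 2$ of this remark can be checked numerically on one example. The sketch below is written under several assumptions that are not in the text: $x$ and $y$ are both taken to be the real number whose binary digits form the characteristic Sturmian sequence of slope $(\sqrt{5} - 1)/2$ (with the convention $c_n = \lfloor (n+1)\alpha \rfloor - \lfloor n\alpha \rfloor$, $n \ge 1$), expansions are truncated to $N$ digits, a small tolerance `eps` absorbs the truncation error, and all names are hypothetical. It illustrates two points: switching digits replaces $x$ by $1 - x$ up to the truncation term $2^{-N}$, and the fractional parts $\{2^k x\}$ stay in the closed interval $[\frac{1}{2} y, \frac{1}{2} + \frac{1}{2} y]$ of length $1/2$.

```python
from fractions import Fraction
from math import isqrt


def characteristic_word(n_terms):
    """Characteristic Sturmian sequence of slope (sqrt(5) - 1) / 2 (assumed convention)."""
    def floor_n_alpha(n):
        return (isqrt(5 * n * n) - n) // 2  # exact floor(n * alpha)
    return [floor_n_alpha(n + 1) - floor_n_alpha(n) for n in range(1, n_terms + 1)]


def value(bits):
    """Exact value of the finite binary expansion 0.b_1 b_2 ... b_m."""
    return sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))


def frac_part(q):
    """Fractional part of a non-negative rational number."""
    return q - (q.numerator // q.denominator)


N, K = 200, 100              # digits kept, number of shifts tested
eps = Fraction(1, 2 ** 90)   # tolerance, far larger than the truncation error

c = characteristic_word(N)
x = value(c)
y = x                        # illustrative choice: the same digits for x and y

# Switching 0's and 1's in the N kept digits gives 1 - x, up to 2**-N.
flipped = value([1 - b for b in c])
print(flipped == 1 - x - Fraction(1, 2 ** N))

# Fractional parts {2^k x} for k = 0, ..., K - 1, computed exactly.
orbit = [frac_part(2 ** k * x) for k in range(K)]
inside = all(y / 2 - eps <= p <= Fraction(1, 2) + y / 2 + eps for p in orbit)
print(inside)
print(float(y / 2), float(Fraction(1, 2) + y / 2))
```

Exact rational arithmetic is used so that the only approximation is the truncation to $N$ digits, which the tolerance accounts for.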

7. Addendum 

While writing this survey we came across several extra references that are related to some 
of its parts. We give some of them here; the interested reader can look at these papers and 
the references therein: about combinatorics of words and Lorenz maps [TUl QZEJ CGI [El EH
[79], about extremal properties of Sturmian sequences or measures [471 E3 EH]), about the
distribution of $\{\xi\alpha^n\}$ [21 EH [851 ES], and last but not least the historical paper of Lorenz
[64] (also see [80]).

8. Acknowledgements 

The authors would like to thank J. Berstel, J. Cassaigne, J. Justin, D. Kwon, G. Pirillo,
G. Richomme, P. Séébold, and L. Q. Zamboni for discussions, comments, and suggestions.

References 

[1] M.-H. Acquier, M. Cosnard, C. Masse, Structure de bifurcations des familles à un paramètre de fonctions
croissantes par morceaux possédant une seule discontinuité, C. R. Acad. Sci. Paris, Sér. I 300 (1985),
17-22.

[2] S. Akiyama, Mahler's Z-number and 3/2 number systems, Unif. Distrib. Theory 3 (2008), 91-99.

[3] S. Akiyama, C. Frougny, J. Sakarovitch, Powers of rationals modulo 1 and rational base number systems,
Israel J. Math. 168 (2008), 53-91.

[4] J.-P. Allouche, Théorie des nombres et automates, Thèse d'État, 1983, Université Bordeaux I
(http://tel.archives-ouvertes.fr/tel-00343206/fr/).

[5] J.-P. Allouche, A note on univoque self-Sturmian numbers, Theor. Inform. Appl. 42 (2008), 659-662.

[6] J.-P. Allouche, M. Cosnard, Itérations de fonctions unimodales et suites engendrées par automates,
C. R. Acad. Sci. Paris, Sér. I 296 (1983), 159-162.

[7] J.-P. Allouche, M. Cosnard, Non-integer bases, iteration of continuous real maps, and an arithmetic
self-similar set, Acta Math. Hungar. 91 (2001), 325-332.

[8] J.-P. Allouche, A. Glen, Distribution modulo 1 and the lexicographic world, Preprint, 2009.

[9] J.-P. Allouche, J. Shallit, Automatic Sequences. Theory, Applications, Generalizations, Cambridge
University Press, UK, 2003.

[10] Ll. Alsedà, A. Falcó, A characterization of the kneading pair for bimodal degree one circle maps, Ann.
Inst. Fourier, Grenoble 47 (1997), 273-304.

[11] Ll. Alsedà, A. Falcó, On the topological dynamics and phase-locking renormalization of Lorenz-like maps,
Ann. Inst. Fourier, Grenoble 53 (2003), 859-883.






[12] Ll. Alsedà, F. Mañosas, Kneading theory and rotation intervals for a class of circle maps of degree one,
Nonlinearity 3 (1990), 413-452.

[13] Ll. Alsedà, F. Mañosas, Kneading theory for a family of circle maps with one discontinuity, Acta Math.
Univ. Comenian. (N. S.) 65 (1996), 11-22.

[14] V. Anagnostopoulou, O. Jenkinson, Which beta-shifts have a largest invariant measure?, J. London Math.
Soc. 79 (2009), 445-464.

[15] P. Arnoux, G. Rauzy, Représentation géométrique de suites de complexité 2n+1, Bull. Soc. Math. France
119 (1991), 199-215.

[16] J. Berstel, Recent results on extensions of Sturmian words, Internat. J. Algebra Comput. 12 (2002),
371-385.

[17] J. Berstel, Sturmian and episturmian words (a survey of some recent results), in CAI 2007, Lecture Notes
in Computer Science, vol. 4728, 2007, pp. 23-47.

[18] J. Berstel, P. Séébold, A remark on morphic Sturmian words, Theor. Inform. Appl. 28 (1994), 255-263.

[19] J. Berstel, P. Séébold, A characterization of Sturmian morphisms, in Borzyszkowski, A.M. and Sokolowski,
S. (Eds.), Mathematical Foundations of Computer Science 1993, Lecture Notes in Computer Science,
vol. 711, Springer-Verlag, Berlin, 1993, pp. 281-290.

[20] J.-P. Borel, F. Laubie, Quelques mots sur la droite projective réelle, J. Théor. Nombres Bordeaux 5 (1993),
23-51.

[21] Y. Bugeaud, A. Dubickas, Fractional parts of powers and Sturmian words, C. R. Acad. Sci. Paris, Sér. I
341 (2005), 69-74.

[22] S. Bullett, P. Sentenac, Ordered orbits of the shift, square roots, and the devil's staircase, Math. Proc.
Cambridge Philos. Soc. 115 (1994), 451-481.

[23] D. P. Chi, DoYong Kwon, Sturmian words, $\beta$-shifts, and transcendence, Theoret. Comput. Sci. 321 (2004),
395-404.

[24] E. M. Coven, G. A. Hedlund, Sequences with minimal block growth, Math. Systems Theory 7 (1973),
138-153.

[25] A. de Luca, Sturmian words: structure, combinatorics and their arithmetics, Theoret. Comput. Sci. 183
(1997), 45-82.

[26] X. Droubay, J. Justin, G. Pirillo, Episturmian words and some constructions of de Luca and Rauzy,
Theoret. Comput. Sci. 255 (2001), 539-553.

[27] X. Droubay, G. Pirillo, Palindromes and Sturmian words, Theoret. Comput. Sci. 223 (1999), 73-85.

[28] A. Dubickas, Arithmetical properties of powers of algebraic numbers, Bull. London Math. Soc. 38 (2006),
70-80.

[29] A. Dubickas, On the distance from a rational power to the nearest integer, J. Number Theory 117 (2006),
222-239.

[30] A. Dubickas, On a sequence related to that of Thue-Morse and its applications, Discr. Math. 307 (2007),
1082-1093.

[31] A. Dubickas, Powers of a rational number modulo 1 cannot lie in a small interval, Acta Arith. 137 (2009),
233-239.

[32] P. Erdős, I. Joó, V. Komornik, Characterization of the unique expansions $1 = \sum_{i=1}^{\infty} q^{-n_i}$ and related
problems, Bull. Soc. Math. France 118 (1990), 377-390.

[33] L. Flatto, J. C. Lagarias, A. D. Pollington, On the range of fractional parts $\{\xi (p/q)^n\}$, Acta Arith. 70
(1995), 125-147.

[34] J.-M. Gambaudo, O. Lanford, C. Tresser, Dynamique symbolique des rotations, C. R. Acad. Sci. Paris,
Sér. I 299 (1984), 823-826.

[35] S. Gan, Sturmian sequences and the lexicographic world, Proc. Amer. Math. Soc. 129 (2001), 1445-1451.

[36] A. Glen, Powers in a class of $\mathcal{A}$-strict standard episturmian words, in 5th International Conference on
Words, Université du Québec à Montréal, Publications du LaCIM 36 (2005), 249-263, and Theoret.
Comput. Sci. 380 (2007), 330-354.






[37] A. Glen, A characterization of fine words over a finite alphabet, Theoret. Comput. Sci. 391 (2008), 51-60.

[38] A. Glen, Order and quasiperiodicity in episturmian words, in Proceedings of the 6th International
Conference on Words, Marseille, France, September 17-21, 2007, pp. 144-158.

[39] A. Glen, J. Justin, Episturmian words: a survey, Theor. Inform. Appl. 43 (2009), 403-442.

[40] A. Glen, J. Justin, G. Pirillo, Characterizations of finite and infinite episturmian words via lexicographic
orderings, European J. Combin. 29 (2008), 45-58.

[41] A. Glen, F. Levé, G. Richomme, Directive words of episturmian words: equivalences and normalization,
Theor. Inform. Appl. 43 (2009), 299-319.

[42] P. Glendinning, C. Sparrow, Prime and renormalisable kneading invariants and the dynamics of expanding
Lorenz maps, (Homoclinic chaos (Brussels, 1991)), Physica D 62 (1993), 22-50.

[43] G. A. Hedlund, Sturmian minimal sets, Amer. J. Math. 66 (1944), 605-620.

[44] A. Heinis, R. Tijdeman, Characterisation of asymptotically Sturmian sequences, Publ. Math. Debrecen 56
(2000), 415-430.

[45] J. H. Hubbard, C. T. Sparrow, The classification of topologically expansive Lorenz maps, Comm. Pure
Appl. Math. 43 (1990), 431-443.

[46] O. Jenkinson, Optimization and majorization of invariant measures, Electron. Res. Announc. Amer. Math.
Soc. 13 (2007), 1-12.

[47] O. Jenkinson, A partial order on $\times 2$-invariant measures, Math. Res. Lett. 15 (2008), 893-900.

[48] O. Jenkinson, L. Q. Zamboni, Characterisations of balanced words via orderings, Theoret. Comput. Sci.
310 (2004), 247-271.

[49] J. Justin, Episturmian morphisms and a Galois theorem on continued fractions, Theor. Inform. Appl. 39
(2005), 207-215.

[50] J. Justin, G. Pirillo, Episturmian words and episturmian morphisms, Theoret. Comput. Sci. 276 (2002),
281-313.

[51] J. Justin, G. Pirillo, On a characteristic property of Arnoux-Rauzy sequences, Theor. Inform. Appl. 36
(2002), 385-388.

[52] J. Justin, G. Pirillo, Episturmian words: shifts, morphisms and numeration systems, Internat. J. Found.
Comput. Sci. 15 (2004), 329-348.

[53] J. Justin, L. Vuillon, Return words in Sturmian and episturmian words, Theor. Inform. Appl. 34 (2000),
343-356.

[54] H. Kaneko, Distribution of sequences modulo 1, Result. Math. 52 (2008), 91-109.

[55] G. Keller, M. St. Pierre, Topological and measurable dynamics of Lorenz maps, in Fiedler, B. (ed.),
Ergodic theory, analysis, and efficient simulation of dynamical systems, Springer-Verlag, Berlin, 2001,
pp. 333-361.

[56] K. Keller, Invariant factors, Julia equivalences and the (abstract) Mandelbrot set, Lecture Notes in
Mathematics, vol. 1732, Springer-Verlag, Berlin, 2000.

[57] J. C. Kieffer, Sturmian minimal systems associated with the iterates of certain functions on an interval,
in Dynamical Systems, Proceedings of the Special Year held at the University of Maryland, College Park,
1986-1987, Lecture Notes in Mathematics, vol. 1342, Springer-Verlag, Berlin, 1988, pp. 354-360.

[58] T. Krüger, S. Schmeling, R. Winkler, L. Q. Zamboni, Dynamics of kneading sequences, Unpublished
preprint (1999).

[59] DoYong Kwon, A devil's staircase from rotations and irrationality measures for Liouville numbers, Math.
Proc. Camb. Philos. Soc. 145 (2008), 739-756.

[60] R. Labarca, C. G. Moreira, Bifurcation of the essential dynamics of Lorenz maps and applications to
Lorenz-like flows: contributions to the study of the expanding case, Bol. Soc. Bras. Mat. 32 (2001),
107-144.

[61] R. Labarca, C. G. Moreira, Bifurcation of the essential dynamics of Lorenz maps on the real line and the
bifurcation scenario for the linear family, Sci. Ser. A Math. Sci. (N.S.) 7 (2001), 13-29.





[62] R. Labarca, C. G. Moreira, Essential dynamics for Lorenz maps on the real line and the lexicographical
world, Ann. Inst. H. Poincaré, Anal. Non Linéaire 23 (2006), 683-694.

[63] R. Labarca, S. Plaza, Bifurcation of discontinuous maps of the interval and palindromic numbers, Bol.
Soc. Mat. Mexicana (3) 7 (2001), 99-116.

[64] E. N. Lorenz, Deterministic nonperiodic flow, J. Atmos. Sci. 20 (1963), 130-141.

[65] M. Lothaire, Algebraic Combinatorics on Words, Encyclopedia of Mathematics and its Applications,
vol. 90, Cambridge University Press, UK, 2002.

[66] K. Mahler, An unsolved problem on the powers of 3/2, J. Austral. Math. Soc. 8 (1968), 313-321.

[67] F. Mignosi, Infinite words with linear subword complexity, Theoret. Comput. Sci. 65 (1989), 221-242.

[68] F. Mignosi, P. Séébold, Morphismes sturmiens et règles de Rauzy, J. Théor. Nombres Bordeaux 5 (1993),
221-233.

[69] F. Mignosi, L. Q. Zamboni, On the number of Arnoux-Rauzy words, Acta Arith. 101 (2) (2002), 121-129.

[70] M. Morse, G. A. Hedlund, Symbolic dynamics II. Sturmian trajectories, Amer. J. Math. 62 (1940), 1-42.

[71] J. Nilsson, Sur les nombres mal approximables par les nombres q-adiques, Doctoral Thesis, Lund University,
LTH, 2007. Available electronically at http://tel.archives-ouvertes.fr/tel-00273870/

[72] G. Pirillo, Inequalities characterizing standard Sturmian words, Pure Math. Appl. 14 (2003), 141-144.

[73] G. Pirillo, Inequalities characterizing standard Sturmian and episturmian words, Theoret. Comput. Sci.
341 (2005), 276-292.

[74] G. Pirillo, Morse and Hedlund's skew Sturmian words revisited, Ann. Comb. 12 (2008), 115-121.

[75] N. Pytheas Fogg, Substitutions in Dynamics, Arithmetics and Combinatorics, Lecture Notes in
Mathematics, vol. 1794, Springer-Verlag, Berlin, 2002.

[76] G. Rauzy, Mots infinis en arithmétique, in M. Nivat, D. Perrin (Eds.), Automata on Infinite Words,
Lecture Notes in Computer Science, vol. 192, Springer-Verlag, Berlin, 1985, pp. 165-171.

[77] G. Richomme, A local balance property of episturmian words, in Developments in Language Theory 2007,
Lecture Notes in Computer Science, vol. 4588, Springer-Verlag, 2007, pp. 371-381.

[78] R. N. Risley, L. Q. Zamboni, A generalization of Sturmian sequences: combinatorial structure and
transcendence, Acta Arith. 95 (2000), 167-184.

[79] L. Silva, J. Sousa Ramos, Topological invariants and renormalization of Lorenz maps, Physica D 162
(2002), 233-243.

[80] C. Sparrow, The Lorenz Equations: Bifurcations, Chaos, and Strange Attractors, Springer-Verlag,
New York, 1982.

[81] R. Tijdeman, On complementary triples of Sturmian bisequences, Indag. Math. 7 (1996), 419-424.

[82] R. Tijdeman, Intertwinings of Sturmian sequences, Indag. Math. 9 (1998), 113-122.

[83] P. Veerman, Symbolic dynamics and rotation numbers, Physica A 134 (1986), 543-576.

[84] P. Veerman, Symbolic dynamics of order-preserving orbits, Physica D 29 (1987), 191-201.

[85] T. Zaïmi, An arithmetical property of powers of Salem numbers, J. Number Theory 120 (2006), 179-191.

[86] T. Zaïmi, On integer and fractional parts of powers of Salem numbers, Arch. Math. 87 (2006), 124-128.

(J.-P. Allouche) CNRS, LRI, UMR 8623, Université Paris-Sud, Bâtiment 490, F-91405 Orsay
Cedex, FRANCE
E-mail address: allouche@lri.fr

(A. Glen) The Mathematics Institute, Reykjavik University, Kringlan 1, IS-103 Reykjavik, ICELAND
Current address: Department of Mathematics and Statistics, School of Chemical and Mathematical
Sciences, Murdoch University, Perth, WA 6150, AUSTRALIA
E-mail address: amy.glen@gmail.com



